I have multiple log files that contain values with headers, as shown below.
I want to build a header file in which each value from column 1 becomes a column header, with that row's min-max range listed beneath it.
Info in log files:
Trace Header Min Max Mean
aaa 1 6 xx
bbb 2 7 xxx
What I want :
aaa bbb
1-6 2-7
Thanks for help
Try this (the long listing is supposed to be in the data variable, read from a file or whatever):
foreach line [split $data \n] {
if {[scan $line {%s %d %d} header min max] eq 3} {
set result($header) $min-$max
}
}
% parray result
result(aaa) = 1-6
result(bbb) = 2-7
The scan command looks for three fields on each line: one text field and two decimal integer fields. A matching line reports three fields found; empty lines or lines with only text report fewer. If a line matches, its header and range are added to the result array.
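To get from the result array to the two-row layout asked for in the question, one option is to print the sorted headers on one line and the matching ranges on the next. This is only a sketch: the inline data value stands in for a real log file, and the tab separator and the names headerRow and rangeRow are assumptions made for illustration.

```tcl
# Sample data standing in for the contents of one log file (assumption).
set data "Trace Header Min Max Mean\naaa 1 6 xx\nbbb 2 7 xxx"

# Collect "min-max" per header, skipping lines that do not scan fully.
foreach line [split $data \n] {
    if {[scan $line {%s %d %d} header min max] == 3} {
        set result($header) $min-$max
    }
}

# Build one row of headers and one row of ranges, in sorted header order.
set headers [lsort [array names result]]
set ranges {}
foreach h $headers {
    lappend ranges $result($h)
}
set headerRow [join $headers \t]
set rangeRow [join $ranges \t]
puts $headerRow
puts $rangeRow
```

Sorting the array names first keeps the two rows aligned, since array names are otherwise returned in no particular order.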
ETA:
To deal with the real-world log file you mentioned in a comment:
foreach line [split $data \n] {
if {[scan $line {%59[ #()-./0-9:=>A-Za-z]%s %d %d} header stuff min max] eq 4} {
set result([string trim $header]) $min-$max
}
}
(Note that duplicate headers are compacted into one in the array.)
If you have whitespace in a field, you can't consume the data with %s. Instead you can find out what kind of data the header might contain by using
% set chars [string map {\n {}} [join [lsort -unique [split $data {}]] {}]]
#()-./0123456789:=>ABCDEFGHILMNOPRSTUVWXY[]abcdefghijklmnopqrstuvwxyz
which is easy to simplify to the field specification
[ #()-./0-9:=>A-Za-z]
If you need to be able to match brackets, put them in like this:
[][ #()-./0-9:=>A-Za-z]
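As a rough illustration of the character-set field, the sketch below scans one made-up line whose header contains spaces and parentheses. The 20-character column width and the sample header are assumptions; the real log used a 59-character field.

```tcl
# Hypothetical log line: a 20-character header column, then min and max.
set line [format "%-20s%d %d" "Gain (dB) A" 1 6]

# %20[...] reads up to 20 characters from the allowed set, spaces included.
set n [scan $line {%20[ #()-./0-9:=>A-Za-z]%d %d} header min max]

# The padding that filled the column is removed afterwards.
set header [string trim $header]
puts "$n fields: '$header' $min-$max"
```

The width matters here: the set contains digits and spaces, so without it the field would swallow the numbers that follow.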
To split at lines containing uppercase text and blanks, then only equal-signs and possibly more blanks up to line end,
package require textutil::split
::textutil::splitx $data {(?n)^[[:upper:] ]+=+\s*$}
Documentation:
eq (operator),
foreach,
if,
join,
lsort,
package,
parray,
regexp,
Syntax of Tcl regular expressions,
scan,
set,
split,
string,
textutil::split (package)
Code snippet:
set foo {
Info in log files:
Trace Header Min Max Mean
aaa 1 6 xx
bbb 2 7 xxx
}
set pattern {^(.*)\s+(\d+)\s+(\d+)\s+.*$}
set result [regexp -line -inline -all -- $pattern $foo]
array set bar {}
puts "Here's one view..."
foreach {all item min max} $result {
puts "$item $min-$max"
set bar([string trim $item]) $min-$max
}
puts ""
puts "Here's another one..."
puts [join [lsort [array names bar]] "\t"]
foreach item [lsort [array names bar]] {
puts -nonewline "$bar($item)\t"
}
Execution output:
Here's one view...
aaa 1-6
bbb 2-7
Here's another one...
aaa bbb
1-6 2-7
Related
I have a input file name "input.dat" with the values as:
7 0
9 9
0 2
2 1
3 4
4 6
5 7
5 6
I want to convert column 2 into a list and then add a number to, or subtract it from, each value using a Tcl script. I have written the following script:
set input [open "input.dat" r]
set data [read $input]
set values [list]
foreach line [split $data \n] {
if {$line eq ""} {break}
lappend values [lindex [split $line " "] 1]
}
puts "$values-2"
close $input
But the output comes out as: 0 9 2 1 4 6 7 6-2
Can anybody help me fix this, or tell me what the error in the script is? A corrected script would also be helpful.
I'm still not 100% sure what you want, but the options all seem to be solvable with the lmap command, which is for applying an operation to each element of a list.
Here's how to concatenate each element with -2:
set values [lmap val $values {
string cat $val "-2"
}]
Here's how to subtract 2 from each element:
set values [lmap val $values {
expr {$val - 2}
}]
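Put together with the column extraction from the question, the subtraction variant might look like the sketch below. The inline data string stands in for the contents of input.dat, and shifted is just an illustrative variable name.

```tcl
# Inline sample standing in for the contents of input.dat (assumption).
set data "7 0\n9 9\n0 2\n2 1\n"

# Take the second field of every non-empty line.
set values {}
foreach line [split $data \n] {
    if {$line eq ""} continue
    lappend values [lindex [split $line " "] 1]
}

# Subtract 2 from each element; lmap needs Tcl 8.6 or later.
set shifted [lmap val $values {expr {$val - 2}}]
puts $shifted
```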
puts will treat it as a string; to do the arithmetic you'll have to use [expr {$val - 2}].
NOTE: If it doesn't work, it is possible that your input values contain something other than a plain int or float (it depends on how the values were read). In this case you can use:
scan $val %d tmp
set newval [expr {$tmp - 2}]
puts $newval
This will convert your string to an int before applying mathematical expressions. You can similarly convert to float by using %f in scan instead of %d.
I want to convert a column of a file into a list using a Tcl script. I have a file named "input.dat" with the data in two columns as follows:
7 0
9 9
0 2
2 1
3 4
And I want to convert the first column into a list and I wrote the Tcl Script as follows:
set input [open "input.dat" r]
set data [read $input]
set values [list]
foreach line [split $data \n] {
lappend values [lindex [split $line " "] 0]
}
puts "$values"
close $input
The result shows as: 7 9 0 2 3 {} {}
Now, my question is: what are these two extra "{}", what error in my script produces them, and how can I solve this problem?
Can anybody help me?
Those empty braces indicate empty strings. The file you used most probably had a couple empty lines at the end.
You could avoid this situation by checking a line before lappending the first column to the list of values:
foreach line [split $data \n] {
# if the line is not equal to blank, then lappend it
if {$line ne ""} {
lappend values [lindex [split $line " "] 0]
}
}
You can also remove those empty strings after getting the result list, but it would mean you'll be having two loops. Still can be useful if you cannot help it.
For example, using lsearch to get all the values that are not blank (probably simplest in this situation):
set values [lsearch -all -inline -not $values ""]
Or lmap to achieve the same (a bit more complex IMO but gives more flexibility when you have more complex situations):
set values [lmap n $values {if {$n != ""} {set n}}]
The first {} is caused by the blank line after 3 4.
The second {} is the empty string that split returns after the file's final newline.
If the last blank line is removed from the file, then there will be only one {}.
If the loop is then coded in the following way, then there will be no {}.
foreach line [split $data \n] {
if { $line eq "" } { break }
lappend values [lindex [split $line " "] 0]
}
@jerry has a better solution
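To see where the empty elements come from, the sketch below splits a small sample that ends in a blank line. It then uses string trimright before splitting, which is a different technique from the break shown above; the sample data is an assumption.

```tcl
# Sample file contents ending in a blank line (assumption for illustration).
set data "7 0\n3 4\n\n"

# split yields one element more than there are separators,
# so the two trailing newlines produce two empty strings.
set pieces [split $data \n]
puts [llength $pieces]

# Trimming trailing newlines before splitting avoids the empty elements.
set values {}
foreach line [split [string trimright $data \n] \n] {
    lappend values [lindex [split $line " "] 0]
}
puts $values
```

Unlike the break version, this does not stop at a blank line in the middle of the file; such a line would still contribute an empty element.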
Unless intermittent empty strings carry some meaning important to your program's task, you may also use a transformation from a Tcl list (with empty-string elements) to a string that prunes empty-string elements (at the ends, and in-between):
concat {*}[split $data "\n"]
I have a csv file which has hostname and attached serial numbers. I want to create a key value pair with key being hostname and value being the list of serial numbers. The serial numbers can be one or many.
For example:
A, 1, 2, 3, 4
B, 5, 6
C, 7, 8, 9
D, 10
I need to access key A and get {1 2 3 4} as output, and if I access D I should get {10}.
How should I do this? The version of Tcl I am using doesn't include packages like csv, and I won't be able to install it on the server, so I am looking for a solution that doesn't use any packages.
For now, I am splitting the line with \n and then I process each element. Then I split the elements with "," and then I get the host name and serial numbers in a list. I then use the 0th index of the list as hostname and remaining values as serial numbers. Is there a cleaner solution?
I'd do something like:
#!/usr/bin/env tclsh
package require csv
package require struct::queue
set filename "file.csv"
set fh [open $filename r]
set q [struct::queue]
csv::read2queue $fh $q
close $fh
set data [dict create]
while {[$q size] > 0} {
set values [lassign [$q get] hostname]
dict set data $hostname [lmap elem $values {string trimleft $elem}]
}
dict for {key value} $data {
puts "$key => $value"
}
then
$ tclsh csv.tcl
A => 1 2 3 4
B => 5 6
C => 7 8 9
D => 10
The repeated recommendation given here is to use the CSV package for this purpose. See also the answer by @glenn-jackman. If unavailable, the time is better invested in obtaining it at your server side.
To get you started, however, you might want to adopt something along the lines of:
set dat {
A, 1, 2, 3, 4
B, 5, 6
C, 7, 8, 9
D, 10
}
set d [dict create]
foreach row [split [string trim $dat] \n] {
set row [lassign [split $row ,] key]
dict set d [string trim $key] [concat {*}$row]
}
dict get $d A
dict get $d D
Be warned, however, such hand-knitted solutions typically only serve their purpose when you have full control of the data being processed and its representation. Again, time is better invested by obtaining the CSV package.
I tried this way and got it working. Thanks again for your inputs. Yes, I know csv package would be easy but I cannot install it in server/product.
set multihost "host_slno.csv"
set fh1 [open $multihost r]
set data [read -nonewline $fh1]
close $fh1
set hostslnodata [ split $data "\n" ]
set hostslno [dict create];
foreach line $hostslnodata {
set line1 [join [split $line ", "] ]
puts "$line1"
if {[regexp {([A-Za-z0-9_\-]+)\s+(.*)} $line1 match hostname serial_numbers]} {
dict lappend hostslno $hostname $serial_numbers
}
}
puts [dict get $hostslno]
The sourcecode from the csv package is available. If you are unable to install the full csv package, you can include the code from here:
http://core.tcl.tk/tcllib/artifact/2898cd911697ecdb
If you still can't use that option, then stripping out all the whitespace and splitting on "," is required.
An alternative to the earlier answers is using string map:
set row [split [string map {" " ""} $row ] ,]
The string map will remove all spaces, and then split on ","
Once you have converted the lines of text into valid tcl lists:
A 1 2 3 4
B 5 6
C 7 8 9
D 10
Then you can use the lindex and lrange commands to pluck off all the pieces.
foreach row $data {
set server [lindex $row 0]
set serial_numbers [lrange $row 1 end]
dict set ...
}
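A complete version of this approach could look like the sketch below. The sample rows stand in for the file contents and the dictionary name hosts is made up; it also assumes that no field contains spaces or commas of its own, since string map removes every space.

```tcl
# Sample rows standing in for the CSV file contents (assumption).
set lines {"A, 1, 2, 3, 4" "B, 5, 6" "C, 7, 8, 9" "D, 10"}

set hosts [dict create]
foreach line $lines {
    # Drop all spaces, then split on commas to get a proper Tcl list.
    set row [split [string map {" " ""} $line] ,]
    # First element is the host name, the rest are its serial numbers.
    dict set hosts [lindex $row 0] [lrange $row 1 end]
}

puts [dict get $hosts A]
puts [dict get $hosts D]
```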
One possibility:
set hostslno [dict create]
set multihost "host_slno.csv"
set fh1 [open $multihost]
while {[gets $fh1 line] >= 0} {
set numbers [lassign [regexp -inline -all {[^\s,]+} $line] hostname]
dict set hostslno $hostname $numbers
}
close $fh1
puts [dict get $hostslno A]
To be more precise:
I need to be looking into a file abc.txt which has contents something like this:
files/f1/atmp.c 98 100
files/f1/atmp1.c 89 100
files/f1/atmp2.c !! 75 100
files/f2/btmp.c 92 100
files/f2/btmp2.c !! 85 100
files/f3/xtmp.c 92 100
The script needs to find "!!" and use those lines to print out the following as output:
atmp2.c 75
btmp2.c 85
Any help?
this should do the trick.
set data {files/f1/atmp.c 98 100
files/f1/atmp1.c 89 100
files/f1/atmp2.c !! 75 100
files/f2/btmp.c 92 100
files/f2/btmp2.c !! 85 100
files/f3/xtmp.c 92 100}
set lines [split $data \n]
foreach line $lines {
set match [regexp {(\S+)\s+!!\s+(\d+)} $line -> file num]
if {$match} {puts "$file $num"}
}
Although regexp has an -all switch, I don't think we can use it here, as with -all the match variables only hold the last match.
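That said, combining -all with -inline returns every match as a list instead of filling match variables, so the whole text can be handled in one call. The following is a sketch under that approach, using a shortened copy of the sample data; file tail is used to strip the directory part, and found is an illustrative name.

```tcl
# Shortened copy of the sample data from the question.
set data {files/f1/atmp.c 98 100
files/f1/atmp2.c !! 75 100
files/f2/btmp2.c !! 85 100}

# -all -inline returns a flat list: full match, path, number, repeated.
set found {}
foreach {all path num} [regexp -all -inline {(\S+)\s+!!\s+(\d+)} $data] {
    # file tail strips the directory part of the path.
    lappend found "[file tail $path] $num"
}
puts [join $found \n]
```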
If your file isn't huge, you can slurp the whole thing into memory, split the lines into a Tcl list, and then iterate through the list looking for a match. For example:
set fh [open foo]
set lines [read $fh]
close $fh
set lines [split $lines "\n"]
foreach line $lines {
if { [regexp {.*/(\S+\.c)\s*!!\s*(\d+)} $line match file data] } {
puts "$file $data"
}
}
This will successfully return just the lines with "!!" in them. With your posted corpus, the results are:
atmp2.c 75
btmp2.c 85
I might be tempted in this case to exec to awk:
set output [exec awk {$2 == "!!" {print $1, $3}} abc.txt]
puts $output
The trick is to combine the code that reads lines from the file with a regular expression that detects matching lines and extracts the relevant parts (a one-step process with regexp). The only tricky part is working out what exactly to use as the regular expression, so that you get exactly what you want. I'm going to guess that you're after the parts of the filenames after the /, that those filenames won't contain spaces, and that the number you're after is the entirety of the first digit sequence after the double exclamation. (Other formats are possible, some of which are easier to extract with other tools such as scan.) That would give us something like this:
set f [open abc.txt]
while {[gets $f line] >= 0} {
if {[regexp {([^\s/]+)\s+!!\s+(\d+)} $line -> name value]} {
# Or do whatever you want with these
puts "$name $value"
}
}
close $f
(The gets command with two arguments returns the length of line read, or -1 on failure. For normal files the only failure mode is EOF, so we can just terminate the loop when we get a negative value. Other kinds of channels can be more complex…)
A file has a few words with numbers at the beginning of them. I want to extract the line with a particular number. When given 1, my script extracts line 1 but also lines such as 11 and 21.
FILE.txt has contents:
1.sample
lines of
2.sentences
present in
...
...
10.the
11.file
When executed as pro 1 file.txt,
it gives results from lines 1 and 10, and also from line 11,
as all three of these lines have a 1 in their string, i.e.
Output of the script:
1.sample
10.the
11.file
What I am expecting
is only the contents of line 1, not the contents of line 10 or line 11,
i.e.
Expected output:
1.sample
My current code:
proc pro { pattern args} {
set file [open $args r]
set lnum 0
set occ 0
while {[gets $file line] >=0} {
incr lnum
if {[regexp $pattern $line]} {
incr occ
puts "The pattern is present in line: $lnum"
puts "$line"
} else {
puts "not found"
}
}
puts "total number of occurrences : $occ"
close $file
}
The program works, but along with the expected line I am retrieving lines that I don't want. Because the number I want to retrieve (1) is also present in other strings such as 11, 21, 14 etc., these lines get printed as well.
Kindly tolerate my unclear way of explaining the question.
You can solve the problem using word boundaries as suggested by glen but you can also consider the following things:
If after every line number there is a . then you can use it as delimiter in regular expression
regexp "^$lineNo\\." $a
I would also suggest to use ^ (match at the beginning of line) so that even if number is present in the line elsewhere it would not get counted.
Tcl word boundaries are well explained at http://www.regular-expressions.info/wordboundaries.html
You have to ensure your pattern matches only between word boundaries:
if {[regexp "\\m$pattern\\M" $line]} { ...
See the documentation for regular expression syntax.
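The small sketch below compares the word-boundary pattern with an anchored one on three sample lines. The helper name startsWithNumber is hypothetical and only there for illustration.

```tcl
# Hypothetical helper: true if the line starts with the given number and a dot.
proc startsWithNumber {lineNo line} {
    return [regexp "^$lineNo\\." $line]
}

# Print both results for each sample line, searching for the number 1.
foreach line {1.sample 10.the 11.file} {
    puts "$line anchored=[startsWithNumber 1 $line] bounded=[regexp {\m1\M} $line]"
}
```

Both forms accept only 1.sample here. The difference is that the word-boundary pattern would also match a standalone 1 anywhere else in a line, while the anchored one only looks at the start.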
If what you're looking to do is as constrained as what you're describing, why not just use something like
if { [string range $line 0 [string length $pattern]] eq "${pattern}." } {
...
}