Get line number using grep - tcl

I would like to get a line number using the grep command, but I get an error message when the search pattern is not a single word:
couldn't read file "Pattern": no such file or directory
What is the proper way to call grep? The code is here:
set status [catch {eval exec grep -n '$textToGrep' $fileName} lineNumber]
if { $status != 0 } {
#error
} else {
puts "lineNumber = $lineNumber"
}
Also, if the search pattern is not matched at all, the returned value is "child process exited abnormally".
Here is the simple test case:
set textToGrep "<BBB name=\"BBBRM\""
file contents:
<?xml version="1.0"?>
<!DOCTYPE AAA>
<AAA>
<BBB name="BBBRM" />
</AAA>

Well, I also get problems with your code and a single word pattern!
First of all, I don't think you need the eval command, because catch itself does an evaluation of its first argument.
Then, the problem is that you put the $textToGrep variable in exec inside single quotes ', which have no meaning to Tcl.
Therefore, if the content of textToGrep is foo, you are asking grep to search for the string 'foo'. If that string, including the single quotes, is not found in the file, you get the error.
Try to rewrite your first line with
set status [catch {exec grep -n $textToGrep $fileName} lineNumber]
and see if it works. Also, read the exec man page, which explains these problems well.
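To see why the single quotes and eval hurt here, it can help to count the words Tcl actually produces. The following is only a small sketch: file.txt and the two-word pattern are placeholders, and list stands in for exec so that nothing is actually run.

```tcl
set textToGrep "two words"

# Without eval: the single quotes are ordinary characters, and the
# substituted value stays inside one word.
set direct [list grep -n '$textToGrep' file.txt]

# With eval: the arguments are joined into one string and parsed again,
# so the space inside the pattern now separates two words.
set viaEval [eval list grep -n '$textToGrep' file.txt]

puts "without eval: [llength $direct] words"
puts "with eval:    [llength $viaEval] words"
```

In both cases the quote characters are passed on literally, which is why grep ends up searching for text that includes them.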

If your system has tcllib installed, you can use the fileutil::grep command from the fileutil package:
package require fileutil
set fileName data.xml
set textToGrep {<BBB +name="BBBRM"}; # Update: Add + for multi-space match
set grepResult [::fileutil::grep $textToGrep $fileName]
foreach result $grepResult {
# Example result:
# data.xml:4: <BBB name="BBBRM" />
set lineNumber [lindex [split $result ":"] 1]
puts $lineNumber
# Update: Get the line, squeeze the spaces before name=
set line [lindex [split $result ":"] 2]
regsub { +name=} $line " name=" line
puts $line
}
Discussion
When assigning a value to textToGrep, I used curly braces, which allow double quotes inside without having to escape them.
The result of the ::fileutil::grep command is a list of strings. Each string contains the file name, the line number, and the line itself, separated by colons.
One way to extract the line number is to first split the string (result) into pieces, using the colon as a separator. Next, I use lindex to grab the second item (index=1, since lists are zero-based).
I have updated the code to account for the case where there are multiple spaces before name=
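Splitting on every colon loses text when the matched line itself contains a colon (a URL, for instance). Here is a minimal sketch of a more careful split, assuming the file name contains no colon. parseGrepResult is a hypothetical helper written for illustration, not part of fileutil.

```tcl
# Hypothetical helper: split "file:lineNumber:text" at the first two
# colons only, so colons inside the matched line are preserved.
proc parseGrepResult {result} {
    if {![regexp {^([^:]*):(\d+):(.*)$} $result -> file lineNumber text]} {
        error "unexpected grep result: $result"
    }
    return [list $file $lineNumber $text]
}

lassign [parseGrepResult {data.xml:4:<a href="http://example.com">}] file lineNumber text
puts "file=$file line=$lineNumber text=$text"
```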

There are two problems here:
1. Pattern matching does not work.
2. grep exits with the error "child process exited abnormally" when the pattern is not found.
The first problem arises because textToGrep is enclosed in single quotes, which mean nothing to Tcl, instead of double quotes. So your code should be:
[catch {exec grep -n "$textToGrep" $fileName} lineNumber]
The second problem is caused by the exit status of the grep command: grep exits with a non-zero status when the pattern is not found. Here is a demonstration in a shell:
# cat file
pattern
pattern with multiple spaces
# grep pattern file
pattern
pattern with multiple spaces
# echo $?
0
# grep nopattern file
# echo $?
1
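The exit status shown above can be told apart from a real failure inside Tcl by looking at the error code that catch records. What follows is a sketch under a few assumptions: Tcl 8.5 or later, a grep that accepts -F and an attached -e pattern, and a hypothetical helper name grepLineNumbers. Attaching the pattern to -e also keeps the word from starting with <, which exec would otherwise read as a redirection.

```tcl
# Hypothetical helper: return the line numbers on which the fixed string
# $text occurs in $fileName, or an empty list when nothing matches.
proc grepLineNumbers {text fileName} {
    set status [catch {exec grep -n -F -e$text -- $fileName} output options]
    if {$status == 0} {
        set numbers {}
        foreach hit [split $output "\n"] {
            # Each hit looks like "4:<BBB ...>": keep the part before the first colon.
            lappend numbers [lindex [split $hit ":"] 0]
        }
        return $numbers
    }
    set code [dict get $options -errorcode]
    if {[lindex $code 0] eq "CHILDSTATUS" && [lindex $code 2] == 1} {
        # grep exit status 1 means "no line matched", which is not a failure.
        return {}
    }
    # Anything else (missing file, bad option, ...) is passed on as an error.
    return -options $options $output
}
```

A call such as grepLineNumbers {<BBB name="BBBRM"} $fileName then returns a (possibly empty) list instead of the "child process exited abnormally" message.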
EDIT:
In your case the pattern contains special characters such as < and >. Tcl's exec treats an argument that begins with < or > as I/O redirection (much as a shell would), so they need to be escaped:
set textToGrep "<BBB name=\"BBBRM\""
regsub -all -- {<} "$textToGrep" "\\\<" textToGrep
regsub -all -- {>} "$textToGrep" "\\\>" textToGrep

set textToGrep {\<BBB name="BBBRM"}
catch {exec grep -n $textToGrep $fileName} status
if {![regexp "child process" $status]} {
puts $status
} else {
puts "no word found"
}
Instead of testing the return code, you can match the captured output against the text "child process" with regexp, as in the code above; check whether it works for you. Inside the if statement you can process the status value however you like.
With the example given in your post the code works, provided that you put a backslash before the "<" in the textToGrep variable.

Related

Expect - avoid sending escape prompt sequences via ssh

The script is intended to retrieve the contents of some directory when it is getting full.
For development, the 'full' was set at 15%, the directory is /var/crash.
expect "#*" {
foreach part $full {
puts "part: $part"
set dir [split $part]
puts "dir: $dir [llength $dir]"
set d [lindex $dir 0]
puts "d: $d"
send -s -- "ls -lhS $d\n"
expect "#*" { puts "for $dir :: $expect_out(buffer)"}
}
}
send "exit\r"
The output of the script is:
part: /var/crash 15%
dir: {/var/crash} 15% 2
d: /var/crash
send: sending "ls -lhS \u001b[01;31m\u001b[K/var\u001b[m\u001b[K/crash\n" to { exp7 }
expect: does "" (spawn_id exp7) match glob pattern "#*"? no
expect: does "ls -lhS \u00071;31m\u0007/var\u0007\u0007/" (spawn_id exp7) match glob pattern "#*"? no
expect: does "ls -lhS \u00071;31m\u0007/var\u0007\u0007/crash\r\n" (spawn_id exp7) match glob pattern "#*"? no
As can be seen, although $d is /var/crash, when it is sent via ssh it becomes something like \u001b[01;31m\u001b[K/var\u001b[m\u001b[K/crash.
I cannot change the remote machine definitions for the command prompt.
How can I get rid of the escape sequences that are sent?
Edit: Info about $full as requested
The proc analyze_df just tries to filter out the meaningful data.
proc analyze_df {cmd txt} {
set full [list]
set lines [split $txt \n]
foreach l $lines {
if {[string match $cmd* $l]} { continue }
set lcompact [regsub -all {\s+} $l " "]
set data [split $lcompact]
if {[string match 8?% [lindex $data 4]] \
|| [string match 9?% [lindex $data 4]] \
|| [string match 1??% [lindex $data 4]] \
|| [string match 5?% [lindex $data 4]] \
|| [string match 1?% [lindex $data 4]] } {
lappend full "[lindex $data 5] [lindex $data 4]"
}
}
return $full
}
Here is the extract showing how $full is set, which was missing:
set command0 "df -h | grep /var"
send -- "$pass\r"
expect {
-nocase "denied*" {puts "$host denied"; continue}
-nocase "Authentication failed*" {puts "$host authentication failed"; continue}
"$*" {send -s -- "$command0\n"}
timeout {puts "$host TIMEOUT"; continue}
}
expect "$*" {puts "$host -> $expect_out(buffer)" }
set full [analyze_df $command0 $expect_out(buffer)]
Taking the suggestion received, perhaps it's grep that is adding the escape sequences, no?
You don't show how $full gets its value. But it must already have the escape codes. When printing $d those escape codes are interpreted by the terminal, so they may not be obvious. But Expect/Tcl definitely doesn't insert them. This is also confirmed by the braces around the first element when you print $dir. If this element was plain /var/crash, there would be no braces.
Your remark about the command prompt would suggest that $full may be taken from there. Maybe you cannot permanently change the remote machine's command prompt, but you should be able to change it for your session by setting the PS1 environment variable.
Another trick that may help in such situations is to do set env(TERM) dumb before spawning the ssh command. If the prompt (or other tools) correctly use the tput command to generate their escape codes, a dumb terminal will result in empty strings. This won't work if the escape codes are hard-coded for one specific TERM. But that's a bug on the remote side.
If you're absolutely stuck with that input data (and can't tell things to not mangle it with those ANSI terminal colour escape codes) then you can strip them out with:
set dir [split [regsub -all {\u001b[^a-zA-Z]*[a-zA-Z]} $part ""]]
This makes use of the fact that the escape sequences start with the escape character (encoded as \u001b) and continue to the first ASCII letter. Replacing them all with the empty string should de-fang them cleanly.
You are recommended to try things like altering the TERM environment variable before calling spawn so that you don't have to do such cleaning. That tends to be easier than attempting to "clean up" the data after the fact.
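If the cleanup does turn out to be necessary, it can be wrapped in a small proc and checked against the string from the debug output. This is a sketch only: stripAnsi is a hypothetical name, and it relies on the same assumption as above, namely that every sequence runs from the escape character to the first ASCII letter.

```tcl
# Hypothetical helper: remove ANSI escape sequences, assuming each one
# starts with ESC (\u001b) and ends at the first ASCII letter.
proc stripAnsi {text} {
    return [regsub -all {\u001b[^a-zA-Z]*[a-zA-Z]} $text ""]
}

# The command string from the question's debug output.
set colored "ls -lhS \u001b\[01;31m\u001b\[K/var\u001b\[m\u001b\[K/crash"
puts [stripAnsi $colored]
```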

Running a piped unix command from a Tcl script

I am trying to run the following:
exec tail CRON_GBOI_INC_AVG_COMPRESS_20140425_18* | grep -i "status of" | awk -F" " '{ print $NF }'
What it does is tail the file and grep for the line containing the text "status of", which returns a string; it then returns the last character in that string.
However, Tcl always throws the following error:
missing close-bracket or close-brace while compiling that line.
How can I change the code to do what I need to achieve? Is it at all possible with Tcl?
Tcl's syntax is not the shell's syntax. The conversion of that line would be:
exec tail {*}[glob CRON_GBOI_INC_AVG_COMPRESS_20140425_18*] | \
grep -i "status of" | awk "-F " {{ print $NF }}
That's to say, the globbing is explicit, the double quotes are round whole words, and the single quotes are changed to braces. (It's also broken over 2 lines with a backslash-newline sequence for clarity.)
Sirs,
As it happens, Tcl string handling may make this easier:
set stringBack [exec tail [lindex $argv 0] | grep -i "[lindex $argv 1]" ]
set wanted [string index $stringBack end]
puts "stat chr is $wanted"
We run it as, say,
./charget /path/to/file 'text hook we choose'
with the file and the text hook as parameters (the text hook is quoted so that it reaches the case-insensitive grep as a single argument).
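Taking the string-handling idea one step further, the whole pipeline can be approximated without any external command. The following is a sketch, assuming a single input file (not a glob) and tail's default of ten lines. statusFields is a hypothetical name.

```tcl
# Hypothetical helper: pure-Tcl stand-in for
#   tail FILE | grep -i "status of" | awk '{ print $NF }'
proc statusFields {fileName {needle "status of"} {tailLines 10}} {
    set fh [open $fileName r]
    set lines [split [read -nonewline $fh] "\n"]
    close $fh

    set fields {}
    set needle [string tolower $needle]
    # Like tail: look only at the last $tailLines lines.
    foreach line [lrange $lines end-[expr {$tailLines - 1}] end] {
        # Like grep -i: case-insensitive substring test.
        if {[string first $needle [string tolower $line]] >= 0} {
            # Like awk's $NF: the last whitespace-separated field.
            lappend fields [lindex [regexp -all -inline {\S+} $line] end]
        }
    }
    return $fields
}
```

The result is a list with one entry per matching line, so a log with a single status line yields a one-element list.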

tcl error : extra characters after close-brace

I'm having trouble debugging this 'extra characters after close-brace' error. The error message points to my proc line, but I haven't been able to spot the problem in two days!
# {{{ MAIN PROGRAM
proc MAIN_PROGRAM { INPUT_GDS_OASIS_FILE L CELL_LIST_FILE } {
if { [file exists $CELL_LIST_FILE] == 0 } {
set celllist [$L cells]
} else {
set fp [open $CELL_LIST_FILE r]
set file_data [read $fp]
close $fp
set celllist [split $file_data "\n"]
set totalcells [expr [llength $celllist] - 1]
}
set counter 0
foreach cell $celllist {
set counter [expr {$counter + 1}]
set value [string length $cell]
set value3 [regexp {\$} $cell]
if { $value > 0 && $value2 == 0 && $value3 == 0 } {
# EXTRACT BOUNDRARY SIZE FIRST
puts "INFO -- READING Num : $counter/$totalcells -- $cell ..."
ONEIP_EXTRACT_BOUNDARY_SIZE $cell $L "IP_SIZE/$cell.txt"
exec gzip -f "IP_SIZE/$cell.txt"
}
}
# }}}
}
# }}}
This seems to be an unfortunate case of using braces in comments. The Tcl parser looks at braces before comments (http://tcl.tk/man/tcl8.5/TclCmd/Tcl.htm). It is a problem if putting braces in comments causes a mismatched number of open/close braces.
Try using a different commenting style, and remove the "{{{" and "}}}" from your comments.
I'm pretty sure that this is down to braces in comments within the proc body.
The wiki page on this has a good explanation. In short, a Tcl comment isn't like a comment in most other languages, and having unmatched braces in one leads to all sorts of issues.
So the braces in the #}}} just before the end of the proc are probably the problem.
Tcl requires procedure bodies to be brace-balanced, even within comments.
OK, that's a total lie. Tcl really requires brace-quoted strings to be brace-balanced (Tcl's brace-quoted strings are just like single-quoted strings in bash, except they nest). The proc command just interprets its third argument as a script (used to define the procedure body) and it's very common to use brace-quoted strings for that sort of thing. This is a feature of Tcl's general syntax, and is why Tcl is very good indeed at handling things like DSLs.
You could instead do this:
proc brace-demo args "puts hi; # {{{"
brace-demo do it yeah
and that will work fine. Totally legal Tcl, and has a comment in a procedure body with unbalanced braces. It just happens that for virtually any real procedure, putting in all the required backslashes to stop interpretation of variable and command substitutions too soon is a total bear. Everyone uses braces for simplicity, and so has to balance them.
It's hardly ever a problem except occasionally for comments.
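The effect is easy to reproduce in isolation. The sketch below builds two tiny procedure definitions as strings (demo is a made-up name) and evaluates each inside catch. The only difference between them is the comment.

```tcl
# The body of $bad ends at the first "}" of the comment, leaving "}}"
# directly after the closing brace of the word.
set bad  "proc demo {} {\n    puts hi\n    # }}}\n}"
set good "proc demo {} {\n    puts hi\n    # end of demo\n}"

set badStatus  [catch {eval $bad} badMessage]
set goodStatus [catch {eval $good} goodMessage]

puts "with braces in the comment: status=$badStatus message=$badMessage"
puts "without them:               status=$goodStatus"
```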

Expect : error can't read "ip": no such variable

I am a newbie in expect / TCL and am trying to parse an HTML page whose output looks something like this:
<li><p>Timestamp: Wed, 14 Nov 2012 16:37:50 -0800
<li><p>Your IP address: 202.76.243.10</p></li>
<li><p class="XXX_no_wrap_overflow_hidden">Requested URL: /</p></li>
<li><p>Error reference number: 1003</p></li>
<li><p>Server ID: FL_23F7</p></li>
<li><p>Process ID: PID_1352939870.809-1-428432242</p></li>
<li><p>User-Agent: </p></li>
My script is below. I am able to fetch the web page, but I am not able to parse the line "Your IP address:"; the attempt gives me errors:
#!/usr/bin/expect -f
set timeout -1
spawn telnet www.whatismyip.com 80
send "GET /\r\n"
expect
set output $expect_out(buffer)
foreach line [split $output \n] {
regexp {.*<li><p>Your IP Address Is:.*?(\d+\.\d+\.\d+\.\d+)} $line ip
if {[string length ${ip}]} {
puts $ip
}
}
The error is:
Connection closed by foreign host.
can't read "ip": no such variable
while executing
"string length ${ip}"
("foreach" body line 3)
invoked from within
"foreach line [split $output \n] {
regexp {.*<li><p>Your IP Address Is:.*?(\d+\.\d+\.\d+\.\d+)} $line ip
if {[string length ${ip}]} {
..."
(file "./t4" line 7)
Any pointers as to what I am doing wrong?
The regular expression did not match, so the variable was not assigned. You should check the result of regexp to see if the match succeeded; when not using the -all option to regexp, you can treat it like a boolean. Try this:
foreach line [split $output \n] {
if {[regexp {<li><p>Your IP Address Is:.*?(\d+\.\d+\.\d+\.\d+)(?!\d)} $line -> ip]} {
puts $ip
}
}
The -> is really a (weird!) variable name which will hold the whole matched string; we're not interested in it (just the parenthetical part), so we use the non-alphabetic name as a mnemonic for “this is going to there” (the submatch to the ip variable).
Your line contains "address" (lowercase) but you're trying to match "Address" (uppercase). Add the -nocase option to the regexp command. Also, Tcl regular expressions cannot have mixed greediness -- the first quantifier determines if the whole expression is greedy or non-greedy (I can't find where this is documented right now). Note that a greedy .* in front of the capture would swallow the leading digits of the address, so skip the non-digits with \D* instead:
regexp -nocase {IP Address\D*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})} $line -> ip
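Putting the two points together (test the return value of regexp, and match case-insensitively), a minimal sketch could look like this. extractIp is a hypothetical helper, and the sample lines are taken from the page shown in the question.

```tcl
# Hypothetical helper: return the dotted-quad address found after
# "IP address" in $line, or an empty string when there is none.
proc extractIp {line} {
    # \D* skips the non-digits between the label and the address.
    if {[regexp -nocase {IP address\D*(\d{1,3}(?:\.\d{1,3}){3})} $line -> ip]} {
        return $ip
    }
    return ""
}

foreach line {
    {<li><p>Your IP address: 202.76.243.10</p></li>}
    {<li><p>Server ID: FL_23F7</p></li>}
} {
    set ip [extractIp $line]
    if {$ip ne ""} {
        puts $ip
    }
}
```

Because the variable is only read after a successful match, the "no such variable" error cannot occur.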
If your ultimate goal is to get your host's external IP, then go with an API solution, such as one from exip.org:
#!/usr/bin/env tclsh
set api http://api-nyc01.exip.org/?call=ip
if {[catch {exec curl --silent $api} output]} {
puts "Failed to acquire external IP"
} else {
puts "My external IP is $output"
}
Please visit their API site for more information, especially if you live outside the USA. This solution requires curl, which you might need to install.

TCL: Check file existence by SHELL environment variable (another one)

I have a file containing lines with paths to files. Sometimes a path contains a SHELL environment variable, and I want to check that the file exists.
The following is my solution:
set fh [open "the_file_contain_path" "r"]
while {![eof $fh]} {
set line [gets $fh]
if {[regexp -- {\$\S+} $line]} {
catch {exec /usr/local/bin/tcsh -c "echo $line" } line
if {![file exists $line]} {
puts "ERROR: the file $line is not exists"
}
}
}
I am sure there is a more elegant solution without using
/usr/local/bin/tcsh -c
You can capture the variable name in the regexp command and do a lookup in Tcl's global env array. Also, your use of eof as the while condition means your loop will iterate one time too many (see http://phaseit.net/claird/comp.lang.tcl/fmm.html#eof)
set fh [open "the_file_contain_path" "r"]
while {[gets $fh line] != -1} {
# this can handle "$FOO/bar/$BAZ"
if {[string first {$} $line] != -1} {
regsub -all {(\$)(\w+)} $line {\1::env(\2)} new
set line [subst -nocommand -nobackslashes $new]
}
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}
First off, it's usually easier (for small files, say of no more than 1–2MB) to read in the whole file and split it into lines instead of using gets and eof in a while loop. (The split command is very fast.)
Secondly, to do the replacement you need the place in the string to replace, so you use regexp -indices. That does mean that you need to take a little more complex approach to doing the replacement, with string range and string replace to do some of the work. Assuming you're using Tcl 8.5…
set fh [open "the_file_contain_path" "r"]
set data [read $fh]
close $fh
foreach line [split $data "\n"] {
# Find a replacement while there are any to do
while {[regexp -indices {\$(\w+)} $line matchRange nameRange]} {
# Get what to replace with (without any errors, just like tcsh)
set replacement {}
catch {set replacement $::env([string range $line {*}$nameRange])}
# Do the replacement
set line [string replace $line {*}$matchRange $replacement]
}
# Your test on the result
if {![file exists $line]} {
puts "ERROR: the file $line does not exist"
}
}
TCL programs can read environment variables using the built-in global variable env. Read the line, look for $ followed by a name, look up $::env($name), and substitute it for the variable.
Using the shell for this is very bad if the file is supplied by untrusted users. What if they put ; rm * in the file? And if you're going to use a shell, you should at least use sh or bash, not tcsh.
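The steps described above (find $NAME, look it up in env, substitute) can be sketched as a small proc that never hands the text to a shell. expandEnv is a hypothetical name, and replacing unset variables with an empty string is an assumption made here for illustration. The string is scanned left to right, so a value that itself contains a $ is not expanded again.

```tcl
# Hypothetical helper: replace each $NAME in $path with the value of
# ::env(NAME); unset variables become an empty string (an assumption).
proc expandEnv {path} {
    set out ""
    set pos 0
    while {[regexp -indices -start $pos {\$(\w+)} $path matchRange nameRange]} {
        lassign $matchRange first last
        set name [string range $path {*}$nameRange]
        # Copy the literal text that precedes this variable reference.
        append out [string range $path $pos [expr {$first - 1}]]
        if {[info exists ::env($name)]} {
            append out $::env($name)
        }
        set pos [expr {$last + 1}]
    }
    # Copy whatever follows the last variable reference.
    append out [string range $path $pos end]
    return $out
}
```

With this in place, the existence test becomes file exists [expandEnv $line], and a line such as ; rm * is only ever treated as a (non-existent) file name.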