Using backslash-newline sequence in Tcl

In Tcl, the backslash is used both to escape special characters and to spread a long command across multiple lines.
For example, a typical if statement can be written as
set some_Variable_here 1
if { $some_Variable_here == 1 } {
puts "it is equal to 1"
} else {
puts "it is not equal to 1"
}
With the help of backslashes, it can also be written as follows:
set some_Variable_here 1
if { $some_Variable_here == 1 } \
{
puts "it is equal to 1"
} \
else {
puts "it is not equal to 1"
}
So, with a backslash, the statements are treated as if they were on the same line.
Now consider the set command.
I can write something like this:
set x Albert\ Einstein;# This works
puts $x
# This one does not work
set y Albert\
Einstein
If I use double quotes or braces, the second form works. So, is it possible to escape the newline with a backslash without double quotes or braces?

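A backslash-newline-whitespace* sequence (i.e., any whitespace following the newline is skipped over) is always replaced with a single space. Outside quotes and braces, that space acts as a word separator, so the command above is parsed as set y Albert Einstein and fails because set receives too many arguments. To get a backslash followed by a newline in the resulting string, use \\ followed by \n instead.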
set y Albert\\\nEinstein
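A small sketch of the three cases, assuming a plain tclsh; the variable names a, b, c, status and msg are only for illustration:

```tcl
# Inside double quotes, backslash-newline (plus the next line's leading
# whitespace) collapses to one space and stays part of the word.
set a "Albert\
       Einstein"
puts [list $a]

# Unquoted, the substituted space separates words, so set sees an extra
# argument. The script is braced here only so that catch can trap the error.
set status [catch {
    set b Albert\
        Einstein
} msg]
puts "$status: $msg"

# \\ yields a literal backslash and \n a newline, without quotes or braces.
set c Albert\\\nEinstein
puts [string length $c]
```

The last value contains a real backslash and a real newline between the two names, which is not the same as the single space produced by the quoted form.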

Related

Tcl: Parsing input with strings in quotes

I have the following code to split stdin into a list of strings:
set cmd [string toupper [gets stdin]]
set items [split $cmd " "]
This splits the user input into a list (items) using the space as a delimiter. It works fine for simple input such as:
HELLO 1 2 3
What I get in items:
HELLO
1
2
3
But how can I get the quoted string in the example below to become one item in the list (items):
"HELLO THERE" 1 2 3
What I want in items:
HELLO THERE
1
2
3
How can I do this?
This is where you get into building a more complex parser. The first step towards that is switching to using regular expressions.
regexp -all -inline {"[^\"]*"|[^\"\s]+} $inputData
That will do the right thing... provided the input is well-formed and only uses double quotes for quoting. It also doesn't strip the quotes off the outside of the "words"; you'll want to use string trim $word \" to clean that up.
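Put together, the two steps might look like the sketch below. The sample input and the variable names inputData, items and word are illustrative; the pattern is the one given above.

```tcl
set inputData {"HELLO THERE" 1 2 3}
set items {}
foreach word [regexp -all -inline {"[^\"]*"|[^\"\s]+} $inputData] {
    # Remove the surrounding double quotes, if any.
    lappend items [string trim $word \"]
}
foreach item $items {
    puts $item
}
```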
If this is a command that you are parsing, use a safe interpreter. Then you can allow Tcl syntax to be used without exposing the guts of your code. I'm pretty sure there are answers here on how to do that already.
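A minimal sketch of the safe-interpreter idea, assuming the user input is a command named HELLO. The handler handleHello and the variable names are hypothetical and not part of the original question.

```tcl
# Hypothetical handler for a user-level HELLO command.
proc handleHello {args} {
    return "HELLO received [llength $args] argument(s): $args"
}

set child [interp create -safe]
# Expose only the commands the user is allowed to run.
interp alias $child HELLO {} handleHello

set line {HELLO "HELLO THERE" 1 2 3}
if {[catch {interp eval $child $line} result]} {
    puts "rejected: $result"
} else {
    puts $result
}
interp delete $child
```

Here Tcl's own parser handles the quoting, and the handler receives the quoted string as a single argument. Note that a safe interpreter still provides the core language (set, while and so on), so untrusted input can still loop forever unless further limits are applied.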
Because Tcl doesn't have strong types, the simplest way to do this is to just treat your stdin string like a list of strings. No need to use split to convert a string into a list.
set cmd {"HELLO THERE" 1 2 3}
foreach item $cmd {
puts $item
}
--> HELLO THERE
1
2
3
Use string is list to check if your $cmd string can be treated as a list.
if {[string is list $cmd]} {
puts "Can be a list"
} else {
puts "Cannot be a list"
}
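The check and the list treatment can be combined, for example as in this sketch. The helper name parseItems is hypothetical, and string is list needs Tcl 8.5 or later.

```tcl
# Hypothetical helper: returns the items, or raises an error for input
# that cannot be parsed as a Tcl list (e.g. an unbalanced quote).
proc parseItems {input} {
    if {![string is list $input]} {
        error "malformed input: $input"
    }
    return [lrange $input 0 end]
}

puts [llength [parseItems {"HELLO THERE" 1 2 3}]]
if {[catch {parseItems {"HELLO THERE 1 2 3}} msg]} {
    puts $msg
}
```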

catch multiple empty lines in file in tcl

My file, opened on the channel in wr_fp, contains four consecutive empty lines, and I want to detect them in code. But the code below is not working.
while {[gets $wr_fp line3] >= 0} {
if {[regexp "\n\s+\n\s+\n\s+\n" $line3]} { puts "found 4 empty lines"}
}
tl;dr: Don't put REs in "quotes", put them in {braces}.
The problem is that you've put your RE in quotes, so that it is actually this:
s+
s+
s+
Because of Tcl's general substitution rules, \n becomes a newline and \s becomes a simple s. Putting the RE in braces inhibits this (unwanted in this case) behaviour.
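The difference is easy to see in tclsh. This is only an illustration; the variable names and the sample string are made up, and a shorter pattern is used.

```tcl
set quoted "\n\s+\n"
set braced {\n\s+\n}

# The quoted form has already been substituted by Tcl:
# newline, s, +, newline.
puts [string length $quoted]
# The braced form reaches regexp exactly as typed.
puts [string length $braced]

set sample "a\n  \nb"
puts [regexp $quoted $sample]
puts [regexp $braced $sample]
```

Only the braced pattern matches the sample, because the quoted one looks for a literal letter s between the newlines.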
This is my own answer; it does what I want.
while {[gets $rd_fp line] >= 0} {
if {[string match "" $line]} {
if {[expr $count % 4] == 1} {puts "found 4 space"}
incr count
}
}
The gets / chan gets command reads one line at a time and discards the newline character from each line, so your test will never succeed. You need to read in the full contents of the file at once:
set txt [chan read $wr_fp]
if {[regexp {\n\s+\n\s+\n\s+\n} $txt]} { puts "found 4 empty lines"}
Note that you need to use braces around the regular expression as Donal explains.
On some typical pitfalls of RE formulation:
Do you really intend to specify that there must be at least one whitespace character on each 'empty' line? If you want to allow lines with no characters at all between the newlines, use \s* instead of \s+.
Also note that this regular expression will match ranges with more than four newlines: the extra newlines will be consumed by one of the \s+ groups. If you want to disallow extra newlines, match with (e.g.) [ \t\f\r] (or any other combination of whitespace you want) instead of \s. Note that this means the expression will match exactly three lines with nothing but blanks, tabs, form feeds, and returns, the lines surrounded and separated by newlines: you might want to extend it with one more subgroup to match the fourth line.
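One way to apply both suggestions is sketched below. The helper name hasFourBlankLines is hypothetical, and the choice to allow completely empty lines (a * quantifier) is an assumption about what is wanted.

```tcl
# Hypothetical helper: true if txt contains a run of at least four lines
# holding nothing but blanks, tabs, form feeds, or returns.
proc hasFourBlankLines {txt} {
    # One blank line: optional non-newline whitespace, then a newline.
    set blank {[ \t\f\r]*\n}
    # The newline ending the preceding line, followed by four blank lines.
    set re "\\n[string repeat $blank 4]"
    return [regexp -- $re $txt]
}

puts [hasFourBlankLines "abc\n\n \n\t\n\ndef\n"]
puts [hasFourBlankLines "abc\n\n\n\ndef\n"]
```

Because the pattern starts with the newline of the preceding line, it will not find blank lines at the very start of the text.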
I'm a bit mystified by your solution as described in your own answer, since it doesn't do what was specified in the question. With the following text file:
abc
def
ghi
jkl
mno
pqr
stu
vwx
yz.
(where there is a tab character in the second line after "pqr")
and assuming count has the value 0 when the code is called, your code outputs "found 4 space" after reading the blank lines after "def", "pqr", and "vwx", but not after the line before "stu", where your question indicated it should be.
This code
set count 0
while {[gets $rd_fp line] >= 0} {
if {[string is space $line]} {
incr count
if {$count == 4} {puts "found 4 space"}
} else {
set count 0
}
}
does do what you asked for (nearly): it accepts lines containing whitespace as empty, and it prints its message only after finding four consecutive empty lines. The major difference from the specification in your question is that it also accepts lines without any characters as empty. To match your specification, string is space -strict $line should be used instead.
Documentation: chan, gets, if, incr, puts, regexp, set, string, while

How to match a colon after a close bracket

Why does the following not match the :
expect {
timeout {puts timedout\n}
\[1\]: {puts matched\n}
}
$ expect test.tcl
[1]:
timedout
If I change it and remove the colon the match works:
expect {
timeout {puts timedout\n}
\[1\] {puts matched\n}
}
$ expect test.tcl
[1]
matched
Or if I get rid of the 1st bracket
expect {
timeout {puts timedout\n}
1\]: {puts matched\n}
}
then it matches:
$ expect test.tcl
1]:
matched
The problem is not with :, but with [.
The [ is special to both Tcl and the Expect pattern matcher so it is particularly messy. To match a literal [, you have to backslash once from Tcl and then again so that it is not treated as a range during pattern matching. The first backslash, of course, has to be backslashed to prevent it from turning the next backslash into a literal backslash!
expect "\\\[" ; #matches literal '['
So, your code should be,
expect {
timeout {puts timedout\n}
\\\[1]: {puts matched\n}
}
You can prefix the ] with a backslash if it makes you feel good, but it is not necessary. Since there is no matching left-hand bracket to be matched within the double-quoted string, nothing special happens with the right-hand bracket. It stands for itself and is passed on to the Expect command, where it is then interpreted as the end of the range.
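Expect's glob patterns follow the rules of Tcl's string match, except that they may match anywhere in the input. That makes it possible to try the pattern text in plain tclsh, as in this sketch. The helper globFinds is hypothetical, and the braced first arguments stand for what the pattern matcher presumably receives after Tcl has removed one level of backslashes.

```tcl
# Hypothetical helper: emulate an unanchored Expect glob match.
proc globFinds {pattern text} {
    return [string match "*${pattern}*" $text]
}

# [1] is a range matching the character 1, which must be followed by ":".
puts [globFinds {[1]:} {[1]:}]
# Without the colon, the range finds the 1 inside the input.
puts [globFinds {[1]} {[1]}]
# A lone ] is an ordinary character.
puts [globFinds {1]:} {[1]:}]
# A backslash in front of [ makes it literal for the matcher.
puts [globFinds {\[1]:} {[1]:}]
```

Only the first call fails, which mirrors the three experiments in the question.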
The next set of examples shows the behavior of [ as a pattern preceded by differing numbers of backslashes. If the [ is not prefixed by a backslash, Tcl interprets whatever follows as a command. For these examples, imagine that there is a procedure named XY that returns the string n*w.
expect "[XY]" ; # matches n followed by anything
expect "\[XY]" ; # matches X or Y
expect "\\[XY]" ; # matches n followed by anything followed by w
expect "\\\[XY]" ; # matches [XY]
expect "\\\\[XY]" ; # matches \ followed by n followed ...
expect "\\\\\[XY]" ; # matches sequence of \ and X or Y
The \\[XY] case deserves close scrutiny. Tcl interprets the first backslash to mean that the second is a literal character. Tcl then produces n*w as the result of the XY command. The pattern matcher ultimately sees the four character string \n*w. The pattern matcher interprets this in the usual way. The backslash indicates that the n is to be matched literally (which it would even without the backslash since the n is not special to the pattern matcher).
Source : Exploring Expect
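The Tcl half of this (what string the pattern matcher is handed) can be checked without Expect. The sketch below defines the imaginary XY procedure and prints each double-quoted word after Tcl's own substitutions; what the matcher then does with the string is not shown.

```tcl
# The imaginary procedure from the examples above.
proc XY {} { return "n*w" }

puts "[XY]"          ;# command substitution only
puts "\[XY]"         ;# \[ is a literal bracket, so no command runs
puts "\\[XY]"        ;# literal backslash, then the result of XY
puts "\\\[XY]"       ;# literal backslash, then a literal bracket
puts "\\\\[XY]"      ;# two literal backslashes, then the result of XY
puts "\\\\\[XY]"     ;# two literal backslashes, then a literal bracket
```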
The patterns that worked for me:
-exact {[1]:}
-exact "\[1]:"
{\[1]:}
"\\\[1]:"

TCL use elseif on new line

I like to structure if {} {} elseif {} {} in multiple lines when the statement block is rather short like below.
if {cond1} {do1}
elseif {cond2} {do2}
elseif {cond3} {do3}
But TCL doesn't let me do it; it gives: invalid command name "elseif"
It works when I spread the braces around the 'do' statements over multiple lines, but that looks so ugly.
if {cond1} {do1
} elseif {cond2} {do2
} elseif {cond3} {do3}
What's the fundamental issue in TCL preventing it from recognizing an elseif on the next line after the if ?
Thanks,
Gert
A line break terminates the current command. To have a command continue on the next line, the newline character must be escaped or quoted.
If the newline is directly preceded by a backslash, the backslash, the newline and all tabs and spaces following in sequence will be replaced by a single space character.
if {cond1} {do1} \
elseif {cond2} {do2} \
elseif {cond3} {do3} \
else {do4}
If the newline is inside braces, it has no syntactic function. It is just another character in the string enclosed by braces and passed to the command. This is useful when you need to pass scripts consisting of several commands to e.g. an if command: the script will be re-interpreted within the command, and those newlines will resume their function there.
if {cond1} {do1
} elseif {cond2} {do2
} elseif {cond3} {do3
} else {do4}
Typical Tcl style is to write commands with script arguments like this:
if {cond1} {
do1
} elseif {cond2} {
do2
} elseif {cond3} {
do3
} else {
do4
}
This visual style isn't to everyone's taste, but one can get used to it.
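A runnable illustration of both behaviours, with made-up conditions and a variable r that is only there to record which branch ran:

```tcl
set x 2

# Continued with backslash-newline: one if command with all its clauses.
if {$x == 1} {set r one} \
elseif {$x == 2} {set r two} \
else {set r other}
puts $r

# Without the backslashes, the first line is a complete if command and
# elseif is looked up as a command of its own.
set status [catch {
    if {$x == 1} {set r one}
    elseif {$x == 2} {set r two}
} msg]
puts $msg
```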
Documentation: Tcl

How to check that the string is single word?

How do I check that a string is a single word?
Is this the right way to do it?
set st "some string"
if { [llength $st] != 1 } {
puts "error"
}
According to one possible definition, you check if a string is one word by using:
catch {set oneWord 0;set oneWord [expr {[llength $string] == 1}]}
That's the Tcl language definition of a word.
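The catch is there because llength raises an error for strings that are not well-formed lists. A variant of the same idea, assuming Tcl 8.5 or later where string is list is available, makes that check explicit. The proc name isOneWord is hypothetical.

```tcl
# Hypothetical helper: true if s parses as a Tcl list of exactly one element.
proc isOneWord {s} {
    return [expr {[string is list $s] && [llength $s] == 1}]
}

foreach s [list "word" "some string" "{some string}" "\{unbalanced"] {
    puts [format "%-16s %d" $s [isOneWord $s]]
}
```

Under this definition a braced string with a space inside still counts as one word, and a string with an unbalanced brace is not a word at all.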
On the other hand, if your preferred definition is “is alphanumeric” then you have other possibilities, such as:
# -strict excludes the empty string (normally included for historic reasons)
set oneWord [string is alnum -strict $string]
My answer is based on the assumption that a word contains only alphabet characters.
If you don't mind using some regexp, you can use this:
set st "some string"
if { ![regexp {^[A-Za-z]+$} $st] } {
puts "error"
}
[regexp expression string] returns 0 if there is no match and 1 if there is a match.
The expression I used is ^[A-Za-z]+$, which means the whole string must consist of one or more letters and nothing else. If you want to allow a dash inside (e.g. co-operate is one word), add it to the character class:
^[A-Za-z-]+$
If you are now worried about trailing spaces, I would suggest trimming it first before passing it to the regexp:
set st " some string "
if { ![regexp {^[A-Za-z]+$} [string trim $st]] } {
puts "error"
}
or if you want to directly use the regexp...
set st " some string "
if { ![regexp {^\s*[A-Za-z]+\s*$} $st] } {
puts "error"
}
EDIT: If a word is considered as a string of characters except space, you can do something else: check if the string contains a space.
set st "some strings"
if { [regexp { } $st] } {
puts "error"
}
If it finds a space, regexp will return 1.
regexp provides a straightforward way to test for word characters with \w and \W. \w matches a word character, while \W matches any character except a word character.
set st "some string"
if { [regexp {\W} $st] } {
puts "error"
}
However, \w matches only letters, digits, and _ (in any combination). If your word contains other special characters, this will not work.
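A quick way to see where \W draws the line; the helper name hasNonWordChar and the sample strings are only for illustration:

```tcl
# Hypothetical helper: true if st contains any character that \w does not match.
proc hasNonWordChar {st} {
    return [regexp {\W} $st]
}

foreach st [list word snake_case1 co-operate "some string"] {
    puts "$st -> [hasNonWordChar $st]"
}
```

The hyphen in co-operate is reported just like the space in "some string", which is the limitation described above.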