Ignore white space after reading csv file tool command line code - tcl

I am new to Tcl programming. I am reading a CSV file whose rows can contain spaces, but Tcl seems to split on spaces. How do I avoid that default behavior?
My CSV is:
1,fname,lname 1
2,fname,lname 2
The split works, but when I output [lindex ${line} 2] I expect to get lname 1. Since Tcl splits on spaces, how do I overcome that issue?
foreach row $data {
set line [split ${row} ","]
puts [lindex ${line} 0]
}

You almost have the answer right there. When doing simple CSV reading, you first split by newline to get the records, and then split by comma to get the fields in a record.
foreach row [split $data "\n"] {
set line [split $row ","]
puts [lindex $line 0]
}
In the complex case (once you start having fields with embedded commas and newlines and so on) you use the csv package from Tcllib, as that handles the nuances for you. In particular csv::read2matrix is helpful.
And if you can, use a character other than a comma to separate fields. Tabs are a commonly recommended choice: tab-separated files are very widely supported and usually interoperate without trouble.
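As a quick check, here is a minimal sketch that runs the two-level split on the sample data from the question (the data is inlined as a string here instead of being read from a file, and the variable name thirdFields is just for illustration). Splitting each record on "," alone leaves the embedded space untouched:

```tcl
# Sample data from the question, inlined as a string for the demo
set data "1,fname,lname 1\n2,fname,lname 2"

set thirdFields {}
foreach row [split $data "\n"] {
    # Split the record on commas only; spaces inside a field survive
    set fields [split $row ","]
    lappend thirdFields [lindex $fields 2]
}

foreach value $thirdFields {
    puts $value
}
```

This prints lname 1 and lname 2, each on its own line.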

Related

change a number in txt file

I have the next expression: jj_ftfll h\\h\ -0.8898:0.006656 0.998:0.99999 h&j\hhh in a txt file,
and I need to add 0.005 to the 0.006656 number. I want to use Tcl and I can't think of any good idea.
There are several aspects that are tricky.
1. The file needs to be edited in-place despite the fact that the addition might change the length of the line. (Such an addition could potentially either lengthen or shorten the line.)
2. There needs to be a way of robustly recognising that that is the line to modify, and not some other line in the file. (This is actually the hardest of these problems in reality; it's extremely application-specific.)
3. The number needs to be extracted from the line, modified, and written back.
4. The values you are dealing with are potentially (well, actually) not represented precisely in IEEE binary floating point, which is what Tcl will use to do the calculations.
Bearing all that in mind, we are talking about these sorts of solutions, point by point:
1. We'll read the whole file in, split it into a list of strings, one string per line (henceforth referred to as the lines), update the lines of interest, and then write the whole lot back.
2. We'll use regexp to decide if a line is of interest. That's by far the most common command for this sort of task.
3. This one is messy in Tcl 8.6 and before. It's got a much better solution in Tcl 8.7.
4. There's really not all that much you can do about this. If you know the range of the numbers, you can use format to help… but it's messy. But maybe you'll get lucky.
set filename "foobar.txt"
# Get the lines of the file; this is GREAT if the file isn't too large
set f [open $filename]
set lines [split [read $f] "\n"]
close $f
# Now THAT'S what I call a horrible regular expression!
set RE {^(jj_ftfll\s+h\\\\h\\\s+-?[\d.]+:)(-?[\d.]+)(\s+-?[\d.]+:-?[\d.]+\s+h&j\\hhh)$}
set newLines {}
foreach line $lines {
if {[regexp $RE $line -> prefix number suffix]} {
set line $prefix[expr {$number + 0.005}]$suffix
}
lappend newLines $line
}
# Write back over the file; the -nonewline prevents the number of lines from growing
set f [open $filename w]
puts -nonewline $f [join $newLines "\n"]
close $f
The trick with the regexp is that I am matching three pieces: the bit of the line before the part to replace (saved in the variable prefix), the number to replace itself (number), and the bit after the part to replace (suffix); the regexp command returns the number of times it matches (1 if the RE is found, 0 if it isn't). It's a scary RE mostly because it has -?[\d.]+ to match those floating point numbers, and I've changed spaces to \s+ (i.e., “at least one whitespace character”).
The version for 8.7 is this:
set filename "foobar.txt"
# Get the lines of the file; this is GREAT if the file isn't too large
set f [open $filename]
set lines [split [read $f] "\n"]
close $f
# Now THAT'S what I call a horrible regular expression!
set RE {^(jj_ftfll\s+h\\\\h\\\s+-?[\d.]+:)(-?[\d.]+)(\s+-?[\d.]+:-?[\d.]+\s+h&j\\hhh)$}
# regsub -command appends the whole match, then each captured group, to the command prefix
proc addDeltaInLine {delta whole prefix number suffix} {
set number [expr {$number + $delta}]
return [string cat $prefix $number $suffix]
}
set newLines [lmap line $lines {
regsub -command $RE $line {addDeltaInLine 0.005}
}]
# Write back over the file; the -nonewline prevents the number of lines from growing
set f [open $filename w]
puts -nonewline $f [join $newLines "\n"]
close $f
The combination of lmap and regsub -command cleans things up quite a bit. The RE is still scary though…
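On the floating-point point, here is a minimal sketch of the format approach, assuming you know the values should carry six decimal places (that precision, the helper name bumpLine, and the loosened suffix pattern are assumptions for illustration, not part of the scripts above). It works on a single line held in a string, so it can be tried without touching a file:

```tcl
# Hypothetical helper: add delta to the number after the first colon and
# render the result with a fixed six decimal places (an assumed precision).
proc bumpLine {line delta} {
    # Same prefix as the RE above; the suffix is loosened to "whitespace, then anything"
    set RE {^(jj_ftfll\s+h\\\\h\\\s+-?[\d.]+:)(-?[\d.]+)(\s+.*)$}
    if {[regexp $RE $line -> prefix number suffix]} {
        return $prefix[format %.6f [expr {$number + $delta}]]$suffix
    }
    # Not a line of interest: hand it back unchanged
    return $line
}

# The line from the question; braces keep the backslashes literal
set sample {jj_ftfll h\\h\ -0.8898:0.006656 0.998:0.99999 h&j\hhh}
puts [bumpLine $sample 0.005]
```

With %.6f the updated field comes out as 0.011656 regardless of any binary rounding noise in the sum. The cost is that every matched number is forced to six decimals, which may or may not suit your file.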

Need help extracting specific lines from a changing logfile using expect

I'm trying to use an expect script to access a remote device via telnet, read/save the remote "EVENTLOG" locally, and then extract specific lines (serial numbers) from the log file. Problem is the log files are constantly changing so I need a way to search for specific strings. The remote device is Linux based, but doesn't have things like grep, vi, less, etc as it's QNX Neutrino, hence having to do it locally.
I have the telnet login, reading the log and saving it locally under control, but I run into problems when I get to "reading" the saved file. Currently I'm just trying to get it to print what it found, but the script just exits without reporting anything except some extra braces.
#!/usr/bin/expect -f
set timeout -1
log_user 1
spawn telnet $IP
match_max 100000
expect "login:"
send -- "$USER\r"
expect "Password:"
send -- "$PW\r"
expect "# "
send -- "\r"
#at this point logged into device
#send command to generate the "dallaslog"
set dallaslog [open dallaslog.txt w]
expect "#"
send -- "cat `ls -rt /LOG/event*`\r"
expect "(cat) exited status=0"
set logout $expect_out(buffer)
puts $dallaslog "$logout"
close $dallaslog
unset expect_out(buffer)
set dallasread [open dallaslog.txt r]
set lines [split [read $dallasread] "\r"]
close $dallasread
puts "${green}$lines{$normal}"
#a debug line to print $dallasread in green so I can verify it works up to here
foreach line $lines {
if {[regexp {.*Dallas ID: 0.*\n} $lines match]} {
if {$match == 1} {
puts $line ;# Prints whole line which has 1 at end
}
}
}
expect "# "
send -- "exit\r"
interact
What I'm (eventually) looking for is the script to catch any line starting with "Dallas ID:" and then to save that information to a variable, so I can use the "scan" command to parse the line and extract information.
What I get is:
(the results from $lines being "puts" in green)
"...
<ENTRY TIME="01/01/1970 00:48:07" PROC="syncd" FILE="mips.cc" LINE="208" NUM="10000">
UTC step from 01/01/1970 00:48:08 to 01/01/1970 00:48:07
</ENTRY>
Process 3174431 (cat) exited status=0
}{}
# exit
Process 3162142 (sh) exited status=0.
Connection closed by foreign host."
Thank you in advance for all the help. I'm a newbie to TCL/expect (been toying with it since last July) but I'm finding it to be a pretty powerful tool, just hard for me to debug!
EDIT: Added more information per @meuh's response.
Example: There can be up to 4 Dallas IDs, but generally I only have 0 and 1. The goal is to get the SN, BC and CN for each Dallas ID saved as variables to put in a separate text file.
<ENTRY TIME="01/01/1970 00:00:06" PROC="sys" FILE="PlatformUtils.cpp" LINE="1227" NUM="10044">
Dallas ID: 1 SN:00000622393A BC: J4AD945 CN: IS200BPPBH2BMD R0: 001C
</ENTRY>
The foreach loop I used was an example from an old question on Stack Overflow that I tried, unsuccessfully, to modify for use here.
EDIT: I should also probably mention that this event log is approximately 800 lines long every time it gets read, which is why I haven't posted an excerpt from it.
This regexp line is probably not doing what you want:
if {[regexp {.*Dallas ID: 0.*\n} $lines match]} {
if {$match == 1} {
puts $line
You are passing the list $lines instead of, presumably, the single line $line. The variable match will be set to the string that matched which must therefore include the words "Dallas" and so on, so it can never be 1.
Your code comment says Prints whole line which has 1 at end, but I'm not sure what you are looking for as you do not have any example data that fits the regexp.
If you use grouping in your regexp pattern, you can capture parts of the line and so perhaps not need a further scan. For example,
regexp {PROC="([a-z]*)"} $line match submatch
would set variable submatch to syncd in your above example.
You may also have a fundamental problem caused by Tcl's handling of \r\n on input from a file. The lines you got from $expect_out(buffer) do indeed have the two characters \r\n as end-of-line delimiters. However, when you read the file back with read, the default translation mode (auto) converts that sequence to a single \n. So your split on \r will not do anything, and you need to split on \n instead. You can check the size of the list of lines you have with
puts [llength $lines]
If it is 1, then your split is not working. Replace it with
set lines [split [read $dallasread] "\n"]
This should help your loop, where for example you can try
foreach line $lines {
if {[regexp {.*Dallas ID: (\d+) SN:([^ ]+)} $line match idnum SN]} {
puts $line
puts "$idnum, $SN"
}
}
You must remove the \n at the end of your regexp, as it is no longer present after the split. I've extended the regexp example with (\d+) to match the id number (\d matches a digit), and ([^ ]+) to match one or more non-space characters after the text SN:.
These values are captured by the use of () grouping, and are placed in the variables idnum and SN, which you should be able to see output by the second puts command.
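Extending the same idea to the other fields, here is a minimal sketch that pulls SN, BC and CN for every Dallas ID out of a block of log text and collects them in a dict keyed by the id number. The proc name parseDallas and the exact field layout are assumptions based on the single example line in the question:

```tcl
# Hypothetical helper: returns a dict of the form {id {SN .. BC .. CN ..}}.
proc parseDallas {text} {
    # \s* after each colon tolerates "SN:value" as well as "BC: value"
    set RE {Dallas ID:\s*(\d+)\s+SN:\s*(\S+)\s+BC:\s*(\S+)\s+CN:\s*(\S+)}
    set result [dict create]
    foreach line [split $text "\n"] {
        if {[regexp $RE $line -> id sn bc cn]} {
            dict set result $id [dict create SN $sn BC $bc CN $cn]
        }
    }
    return $result
}

# The example entry from the question, joined with newlines for the demo
set log [join {
    {<ENTRY TIME="01/01/1970 00:00:06" PROC="sys" FILE="PlatformUtils.cpp" LINE="1227" NUM="10044">}
    {Dallas ID: 1 SN:00000622393A BC: J4AD945 CN: IS200BPPBH2BMD R0: 001C}
    {</ENTRY>}
} "\n"]

set ids [parseDallas $log]
puts "SN=[dict get $ids 1 SN] BC=[dict get $ids 1 BC] CN=[dict get $ids 1 CN]"
```

For the example entry this prints SN=00000622393A BC=J4AD945 CN=IS200BPPBH2BMD. In the expect script you could pass $expect_out(buffer) straight to the proc instead of going through the intermediate file; since \S+ stops at any whitespace, a trailing \r on a line does not end up in the captured values.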

To find line index and word index by reading a text file

I have just started learning Tcl, can someone help me how to find line index and word index for a particular word by reading a text file using Tcl.
Thank you
As mentioned in the comments, there are many basic commands you can use to solve your problem. To read a file into a list of lines you could use the open, split, read and close commands as follows:
set file_name "x.txt"
# Open a file in a read mode
set handle [open $file_name r]
# Create a list of lines
set lines [split [read $handle] "\n"]
close $handle
Finding a certain word in a list of lines can be achieved with a for loop, incr and a set of list-related commands like llength, lindex and lsearch. Most strings in Tcl can be interpreted and processed as a list (strings with unbalanced braces or quotes are the exception). The implementation might look like this:
# Searching for a word "word"
set needle "word"
set w -1
# For each line (you can use `foreach` command here)
for {set l 0} {$l < [llength $lines]} {incr l} {
# Treat a line as a list and search for a word
# (-exact compares literally; lsearch defaults to glob matching)
if {[set w [lsearch -exact [lindex $lines $l] $needle]] != -1} {
# Exit the loop if you found the word
break
}
}
if {$w != -1} {
puts "Word '$needle' found. Line index is $l. Word index is $w."
} else {
puts "Word '$needle' not found."
}
Here, the script iterates over the lines and searches each one for a given word as if it were a list. Executing a list command on a string splits it on whitespace by default. The loop stops when the word is found in a line (when lsearch returns a non-negative index).
Also note that the list commands treat multiple spaces as a single separator. In this case that seems to be the desired behavior. Using the split command on a string with a double space would effectively create a "zero-length word", which might yield an incorrect word index.
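To make the difference concrete, here is a small sketch (all variable names are illustrative) comparing the word index you get from the list interpretation, from split, and from a regexp-based tokenizer. The last option is a common fallback for lines that are not well-formed lists, such as ones with an unbalanced brace:

```tcl
set line "alpha  beta"   ;# note the two spaces

# List interpretation: runs of whitespace count as one separator
set asList [lsearch -exact $line beta]

# split on a single space: the double space yields an empty element
set asSplit [lsearch -exact [split $line " "] beta]

# A line that is not a valid Tcl list (unbalanced open brace)
set tricky "an \{unbalanced brace"
set failed [catch {lsearch -exact $tricky brace}]

# Tokenizing with regexp works on any string
set words [regexp -all -inline {\S+} $tricky]
set asWords [lsearch -exact $words brace]

puts "list: $asList, split: $asSplit, list command failed: $failed, regexp words: $asWords"
```

This prints list: 1, split: 2, list command failed: 1, regexp words: 2.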

collecting set of files using tcl script

I am looking to generate a tcl script, which reads each line of a file, say abc.txt; each line of abc.txt is a specific location of set of files which need to be picked except the ones commented.
For example abc.txt has
./pvr.vhd
./pvr1.vhd
// ./pvr2.vhd
So I need to read each line of abc.txt, pick the file from the location it mentions, and store it in a separate file, except for the ones that start with "//".
Any hint or script will be deeply appreciated.
The usual way of doing this is to put a filter at the start of the loop that processes each line that causes the commented lines to be skipped. You can use string match to do the actual detecting of whether a line is to be filtered.
set f [open "abc.txt"]
set lines [split [read $f] "\n"]
close $f
foreach line $lines {
if {[string match "//*" $line]} {
continue
}
# ... do your existing processing here ...
}
This also works just as well when used with a streaming loop (while {[gets $f line] >= 0} {…}).
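For reference, a minimal sketch of the streaming variant wrapped in a proc. The name readFileList is hypothetical, and trimming whitespace and skipping blank lines are extra assumptions beyond what the question asks for:

```tcl
# Hypothetical helper: returns the non-comment, non-blank lines of a file.
proc readFileList {path} {
    set result {}
    set f [open $path]
    while {[gets $f line] >= 0} {
        set line [string trim $line]
        # Skip blank lines and lines commented out with //
        if {$line eq "" || [string match "//*" $line]} {
            continue
        }
        lappend result $line
    }
    close $f
    return $result
}
```

You could then loop over the result with something like foreach src [readFileList abc.txt] { file copy -force $src $destDir }, where destDir is a directory of your choosing.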

Grep the word inside double quote

How can I extract a word inside a double quote inside a file?
e.g.
variable "xxx"
Reading a text file into Tcl is just this:
set fd [open $filename]
set data [read $fd] ;# Now $data is the entire contents of the file
close $fd
To get the first quoted string (under some assumptions, notably a lack of backslashed double quote characters inside the double quotes), use this:
if {[regexp {"([^""]*)"} $data -> substring]} {
# We found one, it's now in $substring
}
(Doubling up the quote in the brackets is totally unnecessary — only one is needed — but it does mean that the highlighter does the right thing here.)
The simplest method of finding all the quoted strings is this:
foreach {- substring} [regexp -inline -all {"([^""]*)"} $data] {
# One of the substrings is $substring at this point
}
Notice that I'm using the same regular expression in each case. Indeed, it's actually good practice to factor such REs (especially if repeatedly used) into a variable of their own so that you can “name” them.
Combining all that stuff above:
set FindQuoted {"([^""]*)"}
set fd [open $filename]
foreach {- substring} [regexp -inline -all $FindQuoted [read $fd]] {
puts "I have found $substring for you"
}
close $fd
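If you want to try the pattern without a file, here is a small sketch that runs the same loop over an in-memory string (the sample text and the variable name found are made up for the demo):

```tcl
set FindQuoted {"([^"]*)"}
set data {variable "xxx" and another "yyy"}

set found {}
# regexp -inline -all returns full-match/submatch pairs; keep only the submatch
foreach {- substring} [regexp -inline -all $FindQuoted $data] {
    lappend found $substring
}
puts $found
```

This prints xxx yyy.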
Internal Matching
If you're just looking for a regular expression, then you can use TCL's capture groups. For example:
set string {variable "xxx"}
regexp {"(.*)"} $string match group1
puts $group1
This will print xxx, discarding the quotes.
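One caveat with {"(.*)"}: the .* is greedy, so when a string contains more than one quoted word it captures everything from the first quote to the last one. A short sketch of the difference, using a made-up sample string:

```tcl
set text {first "xxx" then "yyy"}

# Greedy: runs from the first double quote to the last one
regexp {"(.*)"} $text -> greedy

# Negated class: stops at the next double quote
regexp {"([^"]*)"} $text -> firstOnly

puts "greedy:    $greedy"
puts "firstOnly: $firstOnly"
```

Here greedy ends up as xxx" then "yyy, while firstOnly is just xxx.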
External Matching
If you want to match data in a file without having to handle reading the file into TCL directly, you can do that too. For example:
set match [exec sed {s/^variable "\(...\)"/\1/} /tmp/foo]
This will call sed to find just the parts of the match you want, and assign them to a TCL variable for further processing. In this example, the match variable is set to xxx as above, but it operates on an external file rather than a stored string.
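If you just want to use grep to find all the quoted words in a file and do something with them, you can do something like this in a shell (as written, grep reads the file on standard input; add a file name argument to read a file directly):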
grep -o '"[^"]*"' | while read word
do
# do something with $word
echo extracted: $word
done