How to parse a text file in tcl using separators? - tcl

I have a text file of the format
35|46
36|49
37|51
38|22
40|1
39|36
41|4
I have to read the file into an array across the separator "|" where left side will be the key of the array and right side will be the value.
I have used the following code
foreach {line} [split [read $lFile] \n] {
#puts $line
foreach {lStr} [split $line |] {
if { $lStr!="" } {
set lPartNumber [lindex $lStr 0]
set lNodeNumber [lindex $lStr 1]
set ::capPartsInterConnected::lMapPartNumberToNodeNumber($lPartNumber) $lNodeNumber
}
}
}
close $lFile
I am not able to read the left side of the separator "|". How to do it?
And similarly for this :
35|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
36|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
I need to assign all three strings in different variables

The mistake is in the inner foreach: the list returned by split is iterated with a single loop variable, lStr, which holds only one element at a time, so the key and the value never appear together in the same iteration.
With lassign, this can be performed easily.
set fp [open input.txt r]
set data [split [read $fp] \n]
close $fp
foreach line $data {
if {$line eq {}} {
continue
}
lassign [split $line | ] key value
set result($key) $value
}
parray result
lassign [split "35|C:\\AI\\DESIGNS\\SAMPLEDSN50\\BENCH_WORKLIB.OLB|R" |] num userDir name
puts "num : $num"
puts "userDir : $userDir"
puts "name : $name"
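For a file of three-field lines, the same lassign pattern can be applied line by line. This is a minimal sketch, assuming the data is already in a string; the names content and parts are illustrative and do not come from the question.

```tcl
# Sample data inlined so the sketch is self-contained; in practice it
# would come from [read $fp]. Braces keep the backslashes literal.
set content {35|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
36|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R}

# parts is a hypothetical dict: number -> list of the other two fields
set parts [dict create]
foreach line [split $content \n] {
    # Skip empty lines (for example the one after a trailing newline)
    if {$line eq {}} {
        continue
    }
    lassign [split $line |] num userDir name
    dict set parts $num [list $userDir $name]
}

dict for {num fields} $parts {
    lassign $fields userDir name
    puts "$num -> $userDir ($name)"
}
```

A dict is used here instead of an array only because it can hold the two remaining fields as one value; an array keyed by num would work as well.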

Related

Inserting single curly braces to Tcl list elements

I have a report file having multiple lines in this form:
str1 num1 num2 ... numN str2
Given that (N) is not the same across lines. These numbers represent coordinates, so I need to enclose each point with curly braces to be:
{num1 num2} {num3 num4} and so on...
I have tried this piece of code:
set file_r [open file.rpt r]
set lines [split [read $file_r] "\n"]
close $file_r
foreach line $lines {
set items [split $line]
set str1 [lindex $items 0]
set str2 [lindex $items [expr [llength $items] - 1]]
set box [lrange $items 1 [expr [llength $items] - 2]]
foreach coord $box {
set index [lsearch $box $coord]
set index_rem [expr $index % 2]
if {index_rem == 0} {
set box [lreplace $box $index $index "{$coord"]
} else {
set box [lreplace $box $index $index "$coord}"]
}
}
puts "box: $box"
}
This gives me a syntax error that a close-brace is missing. And if I try "\{$coord" the back-slash character gets typed in the $box.
Any ideas to overcome this?
There are a few things you could improve to have better and simpler Tcl style.
You usually don't need to use split to form a list from a line if the line is already space separated. Space separated strings can almost always be used directly in list commands.
The exceptions are when the string contains { or " characters.
lindex and lrange can take end and end-N arguments.
This plus Donal's comment to use lmap will result in this:
set file_r [open file.rpt r]
set lines [split [read $file_r] "\n"]
close $file_r
foreach line $lines {
set str1 [lindex $line 0]
set str2 [lindex $line end]
set numbers [lrange $line 1 end-1]
set boxes [lmap {a b} $numbers {list $a $b}]
foreach box $boxes {
puts "box: {$box}"
}
}
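The caveat about { and " characters can also be checked at run time. This is a minimal sketch, assuming Tcl 8.6 (for lmap); toBoxes is a hypothetical helper name, and string is list is used to decide whether the raw line can be treated as a list or has to go through split first.

```tcl
# toBoxes is a hypothetical helper, not part of the answer above.
proc toBoxes {line} {
    # Fall back to split when the raw line is not a well-formed list
    if {![string is list $line]} {
        set line [split $line]
    }
    # Pair up everything between the first and the last word
    return [lmap {a b} [lrange $line 1 end-1] {list $a $b}]
}

puts [toBoxes "str1 1 2 3 4 str2"]
```

For the sample line this prints {1 2} {3 4}.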

Tcl: how to print one set

My file to be parsed is like this
Name : John
Pin : 5400
Age : 40
Place: Korea
Amount : 4000
Name : Peter
Pin : 6700
Age : 10
Place : Japan
Amount : 3600
My tcl code is
set start "Name"
set pn "Pin"
set ag "Age"
set ag_cutoff 15
set amnt "Amount"
foreach line [split $content "\n"] {
if {[regexp $start $line]} {
set count 1
set l1 $line
}
if {[regexp $pn $line] && $count ==1} {
set pin_val [lindex $line 2]
set l2 $line
}
if {[regexp $ag $line] && $count ==1} {
set ag [lindex $line 2]
if { $ag > $ag_cutoff} {
set rep_taken 1
set l3 $line
}
if {[regexp $amnt $line] && $count ==1 && $rep_taken == 1} {
set age_val [lindex $line 2]
puts $op1 "$ag $age_val "
puts $op2 "$l1\n$l2\n$l3\n"
}
This code is fine for plots.
However, I also want to output a file with the complete set of lines for every record where $ag > $ag_cutoff.
With puts $op3 "$l1\n$l2\n$l3\n" I am able to print to a file, but how do I print the Place line, which is not evaluated? Is there a better way to accomplish this? The output I want is:
Name : John
Pin : 5400
Age : 40
Place : Korea
Amount : 4000
It would be a lot simpler to let the parsing loop just create a dictionary (this replaces your code above):
set data {}
set count 0
foreach line [split $content \n] {
if {[lindex $line 0] eq "Name"} {
incr count
}
dict set data $count [lindex $line 0] [lindex $line 2]
}
This will blow up if the first line doesn't start with "Name", or if there is a missing blank between a colon and a word, and also if a value consists of several words. All of these are easy to fix.
Here, for instance, is an expanded version that takes care of the last two problems, should they occur:
set data {}
set count 0
foreach line [split $content \n] {
set keyword [string trimright [lindex $line 0] :]
set value [string trimleft [lrange $line 1 end] {: }]
if {$keyword eq "Name"} {
incr count
}
dict set data $count $keyword $value
}
When all records are stored, one can output selected records using dictionary iteration:
set ag_cutoff 15
dict for {count record} $data {
if {[dict get $record Age] > $ag_cutoff} {
dict for {k v} $record {
puts "$k : $v"
}
}
}
This also means that you can keep adding fields to the records, and the code will still work without change.
Precautions
If the data in content has empty lines at the beginning or end, or between some lines, these methods won't work. A simple way to guard against empty or blank lines at the beginning or the end is to replace
foreach line [split $content \n] {
with
foreach line [split [string trim $content] \n] {
If empty / blank lines may occur within the data, one can use this to skip them:
foreach line [split $content \n] {
if {[string is space $line]} continue
If one is 100% sure that all data is in proper list form, it is possible (but a bit code-smelly) to use list commands like lindex on it directly. If one is less sure, or if one wants to be more correct, one should convert each line to a list before working on it:
foreach line [split $content \n] {
set line [split $line]
Documentation: dict, foreach, if, incr, lindex, lrange, puts, set, split, string
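Putting the pieces above together, a complete run could look like the following sketch. The sample data is inlined as an assumption (in the question it comes from a file), and it deliberately contains a blank line and a missing blank in "Place:".

```tcl
# Sample input inlined for illustration only
set content {
Name : John
Pin : 5400
Age : 40
Place: Korea
Amount : 4000

Name : Peter
Pin : 6700
Age : 10
Place : Japan
Amount : 3600
}

set data {}
set count 0
foreach line [split [string trim $content] \n] {
    # Skip empty or blank lines inside the data
    if {[string is space $line]} continue
    # Convert the line to a proper list before using list commands
    set line [split $line]
    set keyword [string trimright [lindex $line 0] :]
    set value [string trimleft [lrange $line 1 end] {: }]
    if {$keyword eq "Name"} {
        incr count
    }
    dict set data $count $keyword $value
}

# Print every field of each record whose Age exceeds the cutoff
set ag_cutoff 15
dict for {count record} $data {
    if {[dict get $record Age] > $ag_cutoff} {
        dict for {k v} $record {
            puts "$k : $v"
        }
    }
}
```

With this input only John's record is printed, including the Place line.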

How to read a string in tcl using split with the last character?

I am trying to read the following text printing each string before ;
0:1:2:3;
1:2:0;
10:13:15;
I wrote the following code
foreach {line} [split [read $lFile] \n] {
lassign [split $line ;] a
puts $a
}
But the output is the same string. I want the string before ;
In Tcl, a semicolon marks the end of a command, so you are actually executing split $line and not split $line ;. You have to quote the ; for it to work:
foreach {line} [split [read $lFile] \n] {
lassign [split $line ";"] a
puts $a
}
Or using braces:
foreach {line} [split [read $lFile] \n] {
lassign [split $line {;}] a
puts $a
}
You could also use
set a [regsub {;.*} $line ""]
or, assuming no text after the semicolon
set a [string trimright $line ";"]
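The three approaches can be compared side by side on one of the sample lines. This is a small sketch; the variable names viaSplit, viaRegsub and viaTrim are made up for the comparison.

```tcl
set line "0:1:2:3;"

# 1. Split on the quoted semicolon and take the first piece
set viaSplit [lindex [split $line ";"] 0]
# 2. Delete everything from the first semicolon onwards
set viaRegsub [regsub {;.*} $line ""]
# 3. Strip trailing semicolons only
set viaTrim [string trimright $line ";"]

puts [list $viaSplit $viaRegsub $viaTrim]

# The individual numbers are then one more split away
puts [split $viaSplit ":"]
```

All three give 0:1:2:3 for this line; they differ only when there is text after the semicolon.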
The output is the same string because of a mistake inside the foreach (the unquoted semicolon, as explained above). You also don't have to use foreach: you can read the file line by line with a while loop.
set file [open lFile.txt r]
while {[gets $file line] >= 0} {
lassign [split $line ";"] splittedFile
puts stdout $splittedFile
}
close $file
In other words, as long as gets can still read a line (it returns -1 at the end of the file), split that line and print the part before the semicolon to standard output.

splitting input line with varying formats in tcl with

Good afternoon,
I am attempting to write a tcl script which given the input file
input hreadyin;
input wire htrans;
input wire [7:0] haddr;
output logic [31:0] hrdata;
output hreadyout;
will produce
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hready(hreadyout)
In other words, the format is:
<input/output> <wire/logic optional> <width, optional> <paramName>;
with the number of whitespaces unrestricted between each of them.
I have no problem reading from the input file and was able to put each line in a $line element. Now I have been trying things like:
set param0 [split $line "input"]
set param1 [lindex $param0 1]
But since not all lines have "input" line in them i am unable to get the elements i want (the name and the width if it exists).
Is there another command in tcl capable for doing this kind of parsing?
The regexp command is useful to find words separated by arbitrary whitespace:
while {[gets $fh line] != -1} {
# get all whitespace-separated words in the line, ignoring the semi-colon
set i [string first ";" $line]
set fields [regexp -inline -all {\S+} [string range $line 0 $i-1]]
switch -exact -- [llength $fields] {
2 - 3 {
set name [lindex $fields end]
puts [format "%s(%s)," $name $name]
}
4 {
lassign $fields - - width name
puts [format "%s(%s%s)," $name $name $width]
}
}
}
I think you should look at something like
# Compress all multiple spaces to single spaces
set compressedLine [regsub -all " +" $line " "]
set items [split [string range $compressedLine 0 end-1] " "]
switch [llength $items] {
2 {
# Handle case where neither wire/logic nor width is specified
set inputOutput [lindex $items 0]
set paramName [lindex $items 1]
.
.
.
}
4 {
# Handle case where both wire/logic and width are specified
set inputOutput [lindex $items 0]
set wireLogic [lindex $items 1]
set width [lindex $items 2]
set paramName [lindex $items 3]
.
.
.
}
default {
# Don't know how to handle other cases - add them in if you know
puts stderr "Can't handle $line"
}
}
I hope it's not legal to have exactly one of wire/logic and width specified - you'd need to work hard to determine which is which.
(Note the [string range...] fiddle to discard the semicolon at the end of the line)
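If exactly one of wire/logic and width can occur, the two can still be told apart, because a width always has the form [msb:lsb]. This is a sketch under that assumption; formatPort is a hypothetical helper name, and the word extraction reuses the regexp -inline idea from the other answer.

```tcl
# formatPort is a hypothetical helper name used only for this sketch.
proc formatPort {line} {
    # Words before the semicolon, separated by arbitrary whitespace
    set i [string first ";" $line]
    set fields [regexp -inline -all {\S+} [string range $line 0 $i-1]]
    set name [lindex $fields end]
    set width ""
    # Any middle field enclosed in square brackets is taken as the width
    foreach field [lrange $fields 1 end-1] {
        if {[string match {\[*\]} $field]} {
            set width $field
        }
    }
    return "$name\($name$width\)"
}

# Braces keep [7:0] from being treated as a command substitution
puts [formatPort {input   wire [7:0] haddr;}]
puts [formatPort {output [31:0] hrdata;}]
puts [formatPort {input hreadyin;}]
```

The first field (input/output) is ignored here because the expected output does not use it.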
Or, if you can write a regex that captures the right data, you can do it like this:
set data [open "file.txt" r]
set output [open "output.txt" w]
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
puts $output "$params\($params$width\),"
}
close $data
close $output
This one will also print the comma you have inserted in your expected output, but will insert it in the last line as well so you get:
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hready(hreadyout),
If you don't want that trailing comma and the file is not too large (apparently the limit is 2147483672 bytes for a list, which is what I'm going to use), you could collect the lines in a list and join them afterwards:
set data [open "file.txt" r]
set output [open "output.txt" w]
set listing {} ;# Empty list
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
lappend listing "$params\($params$width\)" ;# Appending to the list instead
}
puts $output [join $listing ",\n"] ;# Join everything in a single go
close $data
close $output
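Both loops above assume that every line matches. On a blank line or any other non-matching line, regexp returns 0 and leaves width and params unset or holding the values from the previous line. This sketch checks the return value first; the input is inlined in a variable named text, which is an assumption made so the example runs without files.

```tcl
# Sample input with a blank line and a line that does not match
set text {input hreadyin;
input wire [7:0] haddr;

// a line that does not match
output hreadyout;}

set listing {}
foreach line [split $text \n] {
    # regexp returns 1 only when the line matched; skip everything else
    if {![regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params]} {
        continue
    }
    lappend listing "$params\($params$width\)"
}
puts [join $listing ",\n"]
```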

How to get the data between two strings from a file in tcl?

In TCL Scripting:
I have a file in which I know how to search for a string, but how do I get the line number when the string is found? Please answer if it is possible.
or
set fd [open test.txt r]
while {![eof $fd]} {
set buffer [read $fd]
}
set lines [split $buffer "\n"]
if {[regexp "S1 Application Protocol" $lines]} {
puts "string found"
} else {puts "not found"}
#puts $lines
#set i 0
#while {[regexp -start 0 "S1 Application Protocol" $lines]==0} {incr i
#puts $i
#}
#puts [llength $lines]
#puts [lsearch -exact $buffer S1]
#puts [lrange $lines 261 320]
In the above program I get the output "string found". If I give a string that is not in this file, I get "not found".
The concept of 'a line' is just a convention that we layer on top of the stream of data that we get from a file. So if you want to work with line numbers then you have to calculate them yourself. The gets command documentation contains the following example:
set chan [open "some.file.txt"]
set lineNumber 0
while {[gets $chan line] >= 0} {
puts "[incr lineNumber]: $line"
}
close $chan
So you just need to replace the puts statement with your code to find the pattern of text you want, and when you find it, the value of $lineNumber gives you the line number.
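As a sketch of that replacement: findLine below is a hypothetical helper that takes the text as a string so it runs without a file, and it uses string first for a literal search instead of the regexp from the question. With a channel, the same body would sit inside the gets loop shown above.

```tcl
# findLine is a hypothetical helper, not part of the gets documentation.
# It returns the 1-based number of the first line containing pattern,
# or -1 when no line contains it.
proc findLine {text pattern} {
    set lineNumber 0
    foreach line [split $text \n] {
        incr lineNumber
        # string first returns -1 when the pattern is absent
        if {[string first $pattern $line] >= 0} {
            return $lineNumber
        }
    }
    return -1
}

set sample "header\nS1 Application Protocol\nfooter"
puts [findLine $sample "S1 Application Protocol"]
```

For the sample text this prints 2.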
To copy text that lies between two other lines I'd use something like the following
set chan [open "some.file.txt"]
set out [open "output.file.txt" "w"]
set lineNumber 0
# Read until we find the start pattern
while {[gets $chan line] >= 0} {
incr lineNumber
if { [string match "startpattern" $line]} {
# Now read until we find the stop pattern
while {[gets $chan line] >= 0} {
incr lineNumber
if { [string match "stoppattern" $line] } {
break
} else {
puts $out $line
}
}
}
}
close $chan
# Close the output once, after all reading is done
close $out
The easiest way is to use the fileutil::grep command:
package require fileutil
# Search for ipsum from test.txt
foreach match [fileutil::grep "ipsum" test.txt] {
# Each match is file:line:text
set match [split $match ":"]
set lineNumber [lindex $match 1]
set lineText [lindex $match 2]
# do something with lineNumber and lineText
puts "$lineNumber - $lineText"
}
Update
I realized that if the line contains a colon, then lineText is truncated at the third colon. So, instead of:
set lineText [lindex $match 2]
we need:
set lineText [join [lrange $match 2 end] ":"]
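The same file:line:text layout can also be taken apart with one regexp, which avoids the split and join round trip. This is a sketch only: parseMatch is a hypothetical helper, the sample string is made up, and it assumes the file name itself contains no colon (the same assumption the split version makes).

```tcl
# parseMatch is a hypothetical helper. It assumes the file:line:text
# layout described above and a file name without a colon.
proc parseMatch {match} {
    # File part without colons, then the line number, then the rest
    if {![regexp {^([^:]*):(\d+):(.*)$} $match -> file lineNumber lineText]} {
        error "unexpected match format: $match"
    }
    return [list $file $lineNumber $lineText]
}

lassign [parseMatch "test.txt:3:Lorem ipsum: dolor sit amet"] file lineNumber lineText
puts "$lineNumber - $lineText"
```

Colons inside the matched text are kept because the last group captures everything up to the end of the string.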