Splitting input line with varying formats in Tcl

Good afternoon,
I am attempting to write a tcl script which given the input file
input hreadyin;
input wire htrans;
input wire [7:0] haddr;
output logic [31:0] hrdata;
output hreadyout;
will produce
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hready(hreadyout)
In other words, the format is:
<input/output> <wire/logic optional> <width, optional> <paramName>;
with the number of whitespaces unrestricted between each of them.
I have no problem reading from the input file and was able to put each line in a $line element. Now I have been trying things like:
set param0 [split $line "input"]
set param1 [lindex $param0 1]
But since not all lines contain "input", I am unable to get the elements I want (the name and, if it exists, the width).
Is there another command in tcl capable for doing this kind of parsing?

The regexp command is useful to find words separated by arbitrary whitespace:
while {[gets $fh line] != -1} {
# get all whitespace-separated words in the line, ignoring the semi-colon
set i [string first ";" $line]
set fields [regexp -inline -all {\S+} [string range $line 0 $i-1]]
switch -exact -- [llength $fields] {
2 - 3 {
set name [lindex $fields end]
puts [format "%s(%s)," $name $name]
}
4 {
lassign $fields - - width name
puts [format "%s(%s%s)," $name $name $width]
}
}
}

I think you should look at something like
# Compress all multiple spaces to single spaces
set compressedLine [regsub -all { +} [string trim $line] " "]
set items [split [string range $compressedLine 0 end-1] " "]
switch [llength $items] {
2 {
# Handle case where neither wire/logic nor width is specified
set inputOutput [lindex $items 0]
set paramName [lindex $items 1]
.
.
.
}
4 {
# Handle case where both wire/logic and width are specified
set inputOutput [lindex $items 0]
set wireLogic [lindex $items 1]
set width [lindex $items 2]
set paramName [lindex $items 3]
.
.
.
}
default {
# Don't know how to handle other cases - add them in if you know
puts stderr "Can't handle $line"
}
}
I hope it's not legal to have exactly one of wire/logic and width specified - you'd need to work hard to determine which is which.
(Note the [string range...] fiddle to discard the semicolon at the end of the line)
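The "exactly one of wire/logic and width" case may be less awkward than it looks if a width can be assumed to always start with "[" and to contain no whitespace. A minimal sketch under that assumption; portConnection is a hypothetical helper name, not something from either answer:

```tcl
# Hypothetical helper: turn one declaration line into "name(name[width])".
# Assumes a width field, when present, begins with "[" and has no spaces.
proc portConnection {line} {
    # Drop the semicolon and anything after it
    set i [string first ";" $line]
    if {$i >= 0} {
        set line [string range $line 0 [expr {$i - 1}]]
    }
    # Words separated by arbitrary whitespace
    set fields [regexp -inline -all {\S+} $line]
    if {[llength $fields] < 2} {
        return ""
    }
    set name [lindex $fields end]
    set width ""
    # Any field between the direction and the name that starts with "["
    # is taken to be the width; wire/logic is simply ignored
    foreach field [lrange $fields 1 end-1] {
        if {[string match {\[*} $field]} {
            set width $field
        }
    }
    return "${name}(${name}${width})"
}

puts [portConnection {input   [7:0]  haddr;}]
```

Because the width is recognised by its leading bracket rather than by its position, the two-, three- and four-field lines all go through the same code path.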

Or, if you can write a regex that captures the right data, you can do it like this:
set data [open "file.txt" r]
set output [open "output.txt" w]
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
puts $output "$params\($params$width\),"
}
close $data
close $output
This one will also print the comma you have inserted in your expected output, but will insert it in the last line as well so you get:
hreadyin(hreadyin),
htrans(htrans),
haddr(haddr[7:0]),
hrdata(hrdata[31:0]),
hreadyout(hreadyout),
If you don't want it and the file is not too large (apparently the limit is 2147483672 bytes for a list, which I'm gonna use), you could use a group like this:
set data [open "file.txt" r]
set output [open "output.txt" w]
set listing "" ;# Empty list
while {[gets $data line] != -1} {
regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width params
lappend listing "$params\($params$width\)" ;# Appending to list instead
}
puts $output [join $listing ",\n"] ;# Join all in a single go
close $data
close $output
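The same regexp-and-join idea can be tried without touching any files. A small sketch, in which the sample lines are held in a list and connections and result are hypothetical variable names; unlike the loop above, it also skips lines the pattern does not match:

```tcl
# Sample declarations from the question, held in memory instead of a file
set lines {
    {input hreadyin;}
    {input wire htrans;}
    {input wire [7:0] haddr;}
    {output logic [31:0] hrdata;}
    {output hreadyout;}
}

set connections {}
foreach line $lines {
    # width is set to "" when the optional group takes no part in the match
    if {[regexp -- {(\[\d+:\d+\])?\s*(\w+);} $line - width name]} {
        lappend connections "${name}(${name}${width})"
    }
}

# Join once at the end so the last entry gets no trailing comma
set result [join $connections ",\n"]
puts $result
```

Writing ${name} with braces avoids having to escape the parenthesis, which Tcl would otherwise read as the start of an array index.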

Related

Need to write specific columns in output file using tcl

I am trying to read a file with 5 columns (separated by spaces):
#text tag x y data_lay
bad bad1 10.0 10.0 L1
good goodn 13.0 11.0 L1
And trying to output the specific columns with a prefix on the first column in a new file. Output format should be like following
Add_obj bad 10.0 10.0 L1
Add_obj good 13.0 11.0 L1
I tried the following but have been unsuccessful in getting the anticipated output. Here is the snippet of code:
set fp [open [lindex $argv 0] r]
set colData {}
while {[gets $fp line]>=0} {
if {[llength $line] ==4 } {
set colData [split $line “ “]
puts “Add_obj [lindex $colData 0] [lindex $colData 2] [lindex $colData 3] [lindex $colData 4]”
}
}
close $fp
Could you please help with a sample code?
Thanks.
There's no need to split $line by a space. As long as $line can be used as a proper list, then you can use lindex on $line.
I think you want to print only when llength is 5 (not 4).
I noticed in your sample code that there are non-ascii double quotes “ and ”. You need to have regular double quotes ".
set fp [open a.txt]
while {[gets $fp line]>=0} {
if {[llength $line] == 5 } {
# Skip header?
if {[string match "#*" $line]} {
continue
}
puts "Add_obj [lindex $line 0] [lindex $line 2] [lindex $line 3] [lindex $line 4]"
}
}
close $fp
You might want to also print a formatted string, prepared with the format command.
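As a rough illustration of the format suggestion, the sketch below lines the columns up. The column widths (6 and 8) and the two-decimal precision are arbitrary choices for the example, and rows and formatted are hypothetical variable names:

```tcl
# Sample rows in the question's five-column layout
set rows {
    {bad bad1 10.0 10.0 L1}
    {good goodn 13.0 11.0 L1}
}

set formatted {}
foreach row $rows {
    lassign $row text tag x y layer
    # %-6s left-justifies the text in 6 columns; %8.2f prints the
    # coordinate right-justified in 8 columns with two decimals
    lappend formatted [format "Add_obj %-6s %8.2f %8.2f %s" $text $x $y $layer]
}
puts [join $formatted \n]
```

The tag column is read by lassign but left out of the format call, which mirrors skipping index 1 in the lindex version.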

Tcl: how to print one set

My file to be parsed is like this
Name : John
Pin : 5400
Age : 40
Place: Korea
Amount : 4000
Name : Peter
Pin : 6700
Age : 10
Place : Japan
Amount : 3600
My tcl code is
set start "Name"
set pn "Pin"
set ag "Age"
set ag_cutoff 15
set amnt "Amount"
foreach line [split $content "\n"] {
if {[regexp $start $line]} {
set count 1
set l1 $line
}
if {[regexp $pn $line] && $count ==1} {
set pin_val [lindex $line 2]
set l2 $line
}
if {[regexp $ag $line] && $count ==1} {
set ag [lindex $line 2]
if { $ag > $ag_cutoff} {
set rep_taken 1
set l3 $line
}
if {[regexp $amnt $line] && $count ==1 && $rep_taken == 1} {
set age_val [lindex $line 2]
puts $op1 "$ag $age_val "
puts $op2 "$l1\n$l2\n$l3\n"
}
This code is fine for plots.
However, I also want to o/p a file with complete set where $ag>$ag_cutoff.
Now with puts $op3 "$l1\n$l2\n$l3\n" I am able to print to a file. But how do I print the Place line, which is not evaluated? Is there a better way to accomplish this?
Name : John
Pin : 5400
Age : 40
Place : Korea
Amount : 4000
It would be a lot simpler to let the parsing loop just create a dictionary (this replaces your code above):
set data {}
set count 0
foreach line [split $content \n] {
if {[lindex $line 0] eq "Name"} {
incr count
}
dict set data $count [lindex $line 0] [lindex $line 2]
}
This will blow up if the first line doesn't start with "Name", or if there is a missing blank between a colon and a word, and also if a value consists of several words. All of these are easy to fix.
Here, for instance, is an expanded version that takes care of the last two problems, should they occur:
set data {}
set count 0
foreach line [split $content \n] {
set keyword [string trimright [lindex $line 0] :]
set value [string trimleft [lrange $line 1 end] {: }]
if {$keyword eq "Name"} {
incr count
}
dict set data $count $keyword $value
}
When all records are stored, one can output selected records using dictionary iteration:
set ag_cutoff 15
dict for {count record} $data {
if {[dict get $record Age] > $ag_cutoff} {
dict for {k v} $record {
puts "$k : $v"
}
}
}
This also means that you can keep adding fields to the records, and the code will still work without change.
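Put together on the sample data from the question, the approach might look like the following sketch. The data is held in a string here rather than read from a file, and selected is a hypothetical variable that collects the output lines:

```tcl
# Sample records from the question, including the "Place: Korea" line
set content {Name : John
Pin : 5400
Age : 40
Place: Korea
Amount : 4000
Name : Peter
Pin : 6700
Age : 10
Place : Japan
Amount : 3600}

set data {}
set count 0
foreach line [split $content \n] {
    # Skip empty or blank lines
    if {[string is space $line]} continue
    set keyword [string trimright [lindex $line 0] :]
    set value [string trimleft [lrange $line 1 end] {: }]
    if {$keyword eq "Name"} {
        incr count
    }
    dict set data $count $keyword $value
}

# Collect every field of each record whose Age exceeds the cutoff
set ag_cutoff 15
set selected {}
dict for {count record} $data {
    if {[dict get $record Age] > $ag_cutoff} {
        dict for {k v} $record {
            lappend selected "$k : $v"
        }
    }
}
puts [join $selected \n]
```

Since dictionaries keep their keys in insertion order, the Place line comes out in its original position without any code that mentions it by name.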
Precautions
If the data in content has empty lines at the beginning or end, or between some lines, these methods won't work. A simple way to guard against empty or blank lines at the beginning or the end is to replace
foreach line [split $content \n] {
with
foreach line [split [string trim $content] \n] {
If empty / blank lines may occur within the data, one can use this to skip them:
foreach line [split $content \n] {
if {[string is space $line]} continue
If one is 100% sure that all data is in proper list form, it is possible (but a bit code-smelly) to use list commands like lindex on it directly. If one is less sure, or if one wants to be more correct, one should convert each line to a list before working on it:
foreach line [split $content \n] {
set line [split $line]
Documentation: dict, foreach, if, incr, lindex, lrange, puts, set, split, string

How to parse a text file in tcl using separators?

I have a text file of the format
35|46
36|49
37|51
38|22
40|1
39|36
41|4
I have to read the file into an array across the separator "|" where left side will be the key of the array and right side will be the value.
I have used the following code
foreach {line} [split [read $lFile] \n] {
#puts $line
foreach {lStr} [split $line |] {
if { $lStr!="" } {
set lPartNumber [lindex $lStr 0]
set lNodeNumber [lindex $lStr 1]
set ::capPartsInterConnected::lMapPartNumberToNodeNumber($lPartNumber) $lNodeNumber
}
}
}
close $lFile
I am not able to read the left side of the separator "|". How to do it?
And similarly for this :
35|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
36|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
I need to assign all three strings in different variables
The mistake is in the inner foreach: the result of split is assigned to the loop variable lStr one element at a time, so lStr never holds both fields at once and lindex $lStr 1 is always empty.
With lassign, this can be performed easily.
set fp [open input.txt r]
set data [split [read $fp] \n]
close $fp
foreach line $data {
if {$line eq {}} {
continue
}
lassign [split $line | ] key value
set result($key) $value
}
parray result
lassign [split "35|C:\\AI\\DESIGNS\\SAMPLEDSN50\\BENCH_WORKLIB.OLB|R" |] num userDir name
puts "num : $num"
puts "userDir : $userDir"
puts "name : $name"
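For a whole file of three-field lines, one possible arrangement is to keep the two trailing fields as a list under the numeric key. This is only a sketch: the parts array is a hypothetical name and the data is held in a braced string so that the backslashes stay literal:

```tcl
# Sample three-field lines from the question
set data {
35|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
36|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
}

foreach line [split [string trim $data] \n] {
    if {$line eq {}} continue
    # split returns a proper list, so the backslashes in the path survive
    lassign [split $line |] num userDir name
    # Keep the remaining fields as a two-element list under the numeric key
    set parts($num) [list $userDir $name]
}

puts [lindex $parts(35) 0]
puts [lindex $parts(36) 1]
```

Going through split matters here: calling lindex on the raw line would treat the backslashes in the Windows path as escape characters.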

TCL String Manipulation and Extraction

I have a string xxxxxxx-s12345ab7_0_0_xx2.log and need to have an output like AB700_xx2 in TCL.
ab will be the delimiter and need to extract from ab to . (including ab) and also have to remove only the first two underscores.
Tried string trim, string trimleft and string trimright, but not much use. Is there anything like string split in TCL?
The first stage is to extract the basic relevant substring; the easiest way to do that is actually with a regular expression:
set inputString "xxxxxxx-s12345ab7_0_0_xx2.log"
if {![regexp {ab[^.]+} $inputString extracted]} {
error "didn't match!"
}
puts "got $extracted"
# ===> got ab7_0_0_xx2
Then, we want to get rid of those nasty underscores with string map:
set final [string map {"_" ""} $extracted]
puts "got $final"
# ===> got ab700xx2
Hmm, not quite what we wanted! We wanted to keep the last underscore and to up-case the first part.
set pieces [split $extracted "_"]
set final [string toupper [join [lrange $pieces 0 2] ""]]_[join [lrange $pieces 3 end] "_"]
puts "got $final"
# ===> got AB700_xx2
(The split command divides a string up into “records” by an optional record specifier — which defaults to any whitespace character — that we can then manipulate easily with list operations. The join command does the reverse, but here I'm using an empty record specifier on one half which makes everything be concatenated. I think you can guess what the string toupper and lrange commands do…)
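The steps above could be wrapped into a single proc. In this sketch extractTag is a hypothetical name, and it assumes the extracted part always has at least four underscore-separated pieces:

```tcl
# Hypothetical helper wrapping the regexp/split/join steps
proc extractTag {inputString} {
    # Grab everything from "ab" up to (not including) the first dot
    if {![regexp {ab[^.]+} $inputString extracted]} {
        error "no \"ab\" section found in: $inputString"
    }
    set pieces [split $extracted "_"]
    # Upper-case and fuse the first three pieces, keep the rest joined by "_"
    set head [string toupper [join [lrange $pieces 0 2] ""]]
    set tail [join [lrange $pieces 3 end] "_"]
    return "${head}_${tail}"
}

puts [extractTag "xxxxxxx-s12345ab7_0_0_xx2.log"]
```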
set a "xxxxxxx-s12345ab7_0_0_xx2.log"
set a [split $a ""]
set trig 0
set extract ""
for {set i 0} {$i < [llength $a]} {incr i} {
if {"ab" eq "[lindex $a $i][lindex $a [expr $i+1]]"} {
set trig 1
}
if {$trig == 1} {
append extract [lindex $a $i]
}
}
set extract "[string toupper [join [lrange [split [lindex [split $extract .] 0] _] 0 end-1] ""]]_[lindex [split [lindex [split $extract .] 0] _] end]"
puts $extract
regexp alone can do most of the work; string toupper then upper-cases the first part.
set string "xxxxxxx-s12345ab7_0_0_xx2.log"
regexp {(ab)(.*)_(.*)_(.*)_(.*)\.} $string -> s1 s2 s3 s4 s5
set rstring "[string toupper $s1$s2$s3$s4]_$s5"
puts $rstring

How to get the data between two strings from a file in tcl?

In TCL Scripting:
I have a file in which I know how to search for a string, but how do I get the line number when the string is found? Please answer if it is possible.
or
set fd [open test.txt r]
while {![eof $fd]} {
set buffer [read $fd]
}
set lines [split $buffer "\n"]
if {[regexp "S1 Application Protocol" $lines]} {
puts "string found"
} else {puts "not found"}
#puts $lines
#set i 0
#while {[regexp -start 0 "S1 Application Protocol" $lines]==0} {incr i
#puts $i
#}
#puts [llength $lines]
#puts [lsearch -exact $buffer S1]
#puts [lrange $lines 261 320]
With the above program I get "string found" as output. If I search for a string that is not in the file, I get "not found".
The concept of 'a line' is just a convention that we layer on top of the stream of data that we get from a file. So if you want to work with line numbers then you have to calculate them yourself. The gets command documentation contains the following example:
set chan [open "some.file.txt"]
set lineNumber 0
while {[gets $chan line] >= 0} {
puts "[incr lineNumber]: $line"
}
close $chan
So you just need to replace the puts statement with your own code that looks for the pattern of text you want; when it is found, the value of $lineNumber gives you the line number.
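A minimal sketch of that replacement, assuming a plain substring search is enough. findLineNumbers is a hypothetical helper name, and it returns every matching line number rather than stopping at the first:

```tcl
# Hypothetical helper: return the 1-based numbers of all lines read from
# the channel that contain the literal text $needle
proc findLineNumbers {chan needle} {
    set lineNumber 0
    set hits {}
    while {[gets $chan line] >= 0} {
        incr lineNumber
        # string first returns -1 when $needle does not occur in $line
        if {[string first $needle $line] >= 0} {
            lappend hits $lineNumber
        }
    }
    return $hits
}
```

It would be called with an already opened channel, for instance the result of open test.txt, together with the text to look for, such as "S1 Application Protocol"; closing the channel is left to the caller.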
To copy text that lies between two other lines I'd use something like the following
set chan [open "some.file.txt"]
set out [open "output.file.txt" "w"]
set lineNumber 0
# Read until we find the start pattern
while {[gets $chan line] >= 0} {
incr lineNumber
if { [string match "startpattern" $line]} {
# Now read until we find the stop pattern
while {[gets $chan line] >= 0} {
incr lineNumber
if { [string match "stoppattern" $line] } {
break
} else {
puts $out $line
}
}
}
}
# Close the output only after all reading is done
close $out
close $chan
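When the lines are already in memory, for example after splitting the file contents on newlines, the same start/stop scan can be written as a small proc. This is a sketch only: linesBetween is a hypothetical name, and it stops after the first start/stop pair:

```tcl
# Hypothetical helper: return the lines found strictly between the first
# line matching $startPattern and the next line matching $stopPattern
proc linesBetween {lines startPattern stopPattern} {
    set copying 0
    set result {}
    foreach line $lines {
        if {!$copying} {
            # Still looking for the start marker
            if {[string match $startPattern $line]} {
                set copying 1
            }
        } elseif {[string match $stopPattern $line]} {
            break
        } else {
            lappend result $line
        }
    }
    return $result
}

set text "header\nBEGIN\none\ntwo\nEND\nfooter"
puts [linesBetween [split $text \n] "BEGIN" "END"]
```

As with the channel-based loop, string match compares the whole line against the glob pattern, so a marker that is only part of a line needs a pattern such as *BEGIN*.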
The easiest way is to use the fileutil::grep command:
package require fileutil
# Search for ipsum from test.txt
foreach match [fileutil::grep "ipsum" test.txt] {
# Each match is file:line:text
set match [split $match ":"]
set lineNumber [lindex $match 1]
set lineText [lindex $match 2]
# do something with lineNumber and lineText
puts "$lineNumber - $lineText"
}
Update
I realized that if the line contains a colon, then lineText is truncated at the third colon. So, instead of:
set lineText [lindex $match 2]
we need:
set lineText [join [lrange $match 2 end] ":"]