How to parse txt file containing a repository of patterns

How to parse txt file containing a repository of patterns - tcl

I am new in scripting in TCL, I want to parse a txt file to create a list of patterns based on 2 strings as input.
My file looks like:
keyw1: data1
keyw1: data2
keyw1: Arg1
:
:
keyword: Pattern2Extract
{
some_lines
keyw1: Arg1
keyw2: patternP1
{
some_lines
}
keyw2: Arg2
{
some_lines
}
keyw2: patternP2
{
some_lines
}
.
.
some_others blocks of declaration between braces {}
.
.
}
keyword: Pattern2Extract
{
some_lines
keyw1: Arg1
keyw2: Arg2
{
some_lines
}
keyw2: patternP1
{
some_lines
}
keyw2: patternP2
{
some_lines
}
.
.
some_others blocks of declaration between braces {}
.
.
}
So, I would like to output 2 list of 'Pattern2Extract'
list1: if Arg1 is found in structure grouped between curly braces {}
list2: if arg1 and arg2 are both in structure grouped between curly braces {}
I have tried lsearch and lindex and it's working for list1 but I don't know how to do it for list2.
Here is my script:
proc inst_nm {inpFile outFile} {
set chanId [open $inpFile r]
set data [list]
while {[gets $chanId line] != -1} {
lappend data $line
}
close $chanId
foreach dt $data {
set MasDat [lindex $dt 0]
set pinDat [lindex $dt 1]
}
set intId [open "./filetoparse.txt" r]
set instDat [list]
while {[gets $intId line] != -1} {
lappend instDat $line
}
close $intId
set writeId [open $outFile a]
set MasterList [lreplace [lsearch -all $instDat *$MasDat*] 0 0]
foreach elem $MasterList {
set cellLn [lindex [split [lindex $instDat $elem ] ":"] 1]
set instName [lindex [split [lindex $instDat [expr $elem -5]] ":"] 1]
set PinLn [lindex [split [lindex $instDat [expr $elem +1]] ":"] 1]
foreach ele $PinLn {
if {"$ele"=="$pinDat" } {
puts $writeId "$instName $pinDat $cellLn"
} else {
puts $writeId "$instName $ele $cellLn"
}
}
}
close $writeId
}
inst_nm [lindex $::argv 0] [lindex $::argv 1]

Currently, inpFile may have many lines like $MastDat $pinDat and I need to collect instDat corresponding to each pair ($MastDat,$pinDat).
in file_to_parse by construction, we know that instName come in fifth line before $MastDat. However, we don't know the position of line conatining $pinDat declaration and this pattern could be present or not into instance section:
keyword: Pattern2Extract { some_lines keyw1: Arg1 keyw2: patternP1 { some_lines } keyw2: Arg2 { some_lines } keyw2: patternP2 { some_lines } . . some_others blocks of declaration between braces {} . . }
so, in list2 we should get all insName in where $pinDat is found
Thank you for your help

It helps to break out the code into another proc. In Tcl the proc must be declared ahead of when you call it. The data file didn't reflect your parser and also the MasterList might be removing the found item your looking for. Below is your parser broken up with example files that reflect what it's doing.
#!/usr/bin/tclsh
proc findPin {MasDat pinDat instDat} {
# set MasterList to the list of indexes found for *$MastDat*
set MasterList [lsearch -glob -all $instDat *$MasDat*]
set found [list]
# for each index number in MasterList
foreach elem $MasterList {
# n-5 (key: value(instName))
# n-4
# n-3
# n-2
# n-1
# n (key: value(cellLn)
# n+1 (key: value(PinLn)
set cellLn [lindex [split [lindex $instDat $elem ] ":"] 1]
set instName [lindex [split [lindex $instDat [expr $elem -5]] ":"] 1]
set PinLn [lindex [split [lindex $instDat [expr $elem +1]] ":"] 1]
foreach ele $PinLn {
if {"$ele"=="$pinDat" } {
lappend found "$instName $pinDat $cellLn"
}
}
}
return $found
}
proc inst_nm {inpFile outFile} {
# geta all lines in filestoparse.txt
set intId [open "./filetoparse.txt" r]
set instDat [list]
while {[gets $intId line] != -1} {
lappend instDat $line
}
close $intId
set writeId [open $outFile a]
# Search each line in inpFile
set chanId [open $inpFile r]
while {[gets $chanId line] != -1} {
set MasDat [lindex $line 0]
set pinDat [lindex $line 1]
foreach {item} [findPin $MasDat $pinDat $instDat] {
puts $writeId $item
}
}
close $chanId
close $writeId
}
inst_nm [lindex $::argv 0] [lindex $::argv 1]
filetoparse.txt
INST_NAME:MyInst
unknown-1
unknown-2
unknown-3
unknown-4
CELL_LN:MyCellLn
PIN_LN:pin1 pin2 pin3 pin4 pin5
unknown...
INST_NAME:TestInst
unknown-1
unknown-2
unknown-3
unknown-4
CELL_LN:TestCell
PIN_LN:test1 test2 test3
inputfile.txt
MyCellLn pin4
MyCellLn pin25
TestCell test1
TestCell test10
MyCellLn pin3
Output:
% ./keylist.tcl inputfile.txt keylist_found.txt
% cat keylist_found.txt
MyInst pin4 MyCellLn
TestInst test1 TestCell
MyInst pin3 MyCellLn

Actually, I'm interested just by printing '$instName' for each pair line from inpFile '$cellLn $pinDat'
filetoparse.txt:
INST_NAME:Inst1
{
4 unknown lines
CELL_LN: Cell1
other unkown lines
PIN_LN:pin1
unkown
PIN_LN:pin5
unknown...
}
INST_NAME:Inst2
{
4 unknown lines
CELL_LN: Cell1
other unkown lines
PIN_LN:pin3
unkown
PIN_LN:pin5
unknown...
}
INST_NAME:Inst3
{
4 unknown lines
CELL_LN: Cell2
other unkown lines
PIN_LN:pin2
unkown
PIN_LN:pin4
unknown...
}
INST_NAME:Inst4
{
4 unknown lines
CELL_LN: Cell2
other unkown lines
PIN_LN:pin5
unkown
PIN_LN:pin2
unknown...
}
inpFile.txt
cell1 pin1
cell2 pin2
So, I want in OutputFile have something like:
- for cell1 pin1:
list1: {Inst1 Inst2}
list2: {Inst1}
- for cell2 pin2:
list1: {Inst3 Inst4}
list2: {Inst3 Inst4}
Thank you for your help,

Related

Inserting single curly braces to Tcl list elements

I have a report file having multiple lines in this form:
str1 num1 num2 ... numN str2
Given that (N) is not the same across lines. These numbers represent coordinates, so I need to enclose each point with curly braces to be:
{num1 num2} {num3 num4} and so on...
I have tried this piece of code:
set file_r [open file.rpt r]
set lines [split [read $file_r] "\n"]
close $file_r
foreach line $lines {
set items [split $line]
set str1 [lindex $items 0]
set str2 [lindex $items [expr [llength $items] - 1]]
set box [lrange $items 1 [expr [llength $items] - 2]]
foreach coord $box {
set index [lsearch $box $coord]
set index_rem [expr $index % 2]
if {index_rem == 0} {
set box [lreplace $box $index $index "{$coord"]
} else {
set box [lreplace $box $index $index "$coord}"]
}
}
puts "box: $box"
}
This gives me a syntax error that a close-brace is missing. And if I try "\{$coord" the back-slash character gets typed in the $box.
Any ideas to overcome this?

There are a few things you could improve to have better and simpler Tcl style.
You usually don't need to use split to form a list from a line if the line is already space separated. Space separated strings can almost always be used directly in list commands.
The exceptions are when the string contains { or " characters.
lindex and lrange can take end and end-N arguments.
This plus Donal's comment to use lmap will result in this:
set file_r [open file.rpt r]
set lines [split [read $file_r] "\n"]
close $file_r
foreach line $lines {
set str1 [lindex $line 0]
set str2 [lindex $line end]
set numbers [lrange $line 1 end-1]
set boxes [lmap {a b} $numbers {list $a $b}]
foreach box $boxes {
puts "box: {$box}"
}
}

Parse a CSV file to TCL

I have a file as below:
a, b, c, d, e
S, 1.0, 100, F, fast
T, 2.0, 200, S, slow
First ROW is header only (a, b, c, d, e) and 2nd, 3rd row is the value (S, 1.0, 100, F, fast) correspond to the header.
I would like to read the file below into tcl and puts out the values (ie: row 2, column 5 -> fast)
I wrote the below script but doesnt seem to work:
proc game {name infile outfile} {
set csv [open $infile r]
set csv_lines [read $csv]
set out [open $outfile w]
set info [split $csv "\n"]
set infocount [llength $info]
set line 1
foreach line $info {
set values [split $line ","]
set firstline [lindex $values 0]
set secondline [lindex $values 1]
### HOW DO I PUTS OUT ROW2 COL5 or ROW1 COL3 ###
puts $outfile "$firstline"
}
close $infile
close $outfile
}
Want outfile to be as below:
a: S b: 1.0 c: 100 d: F e: fast
a: T b: 2.0 c: 200 d: S e: slow
or
a: T b: 2.0 c: 100 d: F e: slow
a: S b: 1.0 c: 200 d: F e: fast

Using the csv package from tcllib is the way to go for robustness, but on trivial data like this, split will work.
#!/usr/bin/env tclsh
proc game {name infile outfile} {
set in [open $infile r]
set out [open $outfile w]
set header [split [gets $in] ,]
while {[gets $in line] > 0} {
foreach col $header val [split $line ,] {
puts -nonewline $out "$col: $val "
}
puts $out ""
}
close $in
close $out
}
game foo input.csv output.txt

You might do:
package require csv
proc splitline {fh} {
if {[gets $fh line] != -1} {
set fields [csv::split $line]
return [lmap field $fields {string trimleft $field}]
}
}
proc transform {file} {
set fh [open $file r]
set head [splitline $fh]
while {[set fields [splitline $fh]] ne ""} {
puts [join [lmap h $head f $fields {string cat $h ":" $f}]]
}
close $fh
}
transform "file.csv"
a:S b:1.0 c:100 d:F e:fast
a:T b:2.0 c:200 d:S e:slow

You could use a dict to store the data of the csv file:
proc game {name inFile} {
upvar csv_data csv_data
set csv [open $inFile r]
set csv_lines [read $csv]
set row 0
foreach line [split $csv_lines "\n"] {
set values [split $line ","]
for {set col 0} {$col < [llength $values]} {incr col} {
dict set csv_data $row [expr {$col+1}] [string trim [lindex $values $col]]
}
incr row
}
close $csv
}
set csv_data {}
game foo input.csv
Now you can read from the dict like the below, where row 0 contains the headers, and col 1 is the one with a as header:
# To get row 2 col 5:
puts [dict get $csv_data 2 5]
# => slow
# To get row 1 col 3:
puts [dict get $csv_data 1 3]
# => 100
To print in the other format you asked, you'll need to do a little more work:
set outFile [open output.txt w]
for {set row 1} {$row < [llength [dict keys $csv_data]]} {incr row} {
set lineOut ""
foreach {- header} [dict get $csv_data 0] {- value} [dict get $csv_data $row] {
lappend lineOut "$header: $value"
}
puts $outFile [join $lineOut " "]
}
close $outFile
output.txt:
a: S b: 1.0 c: 100 d: F e: fast
a: T b: 2.0 c: 200 d: S e: slow

How to compare two lines in different files and output the same position in the other line in TCL?

I have two files and I want the output like below. Please help by providing me with a TCL script.
File1:
Name1: F * F F F
Name2: F F *
Name3: F F F F
File2:
Name1: AA, BB, CC, DD, EE,
Name2: AA, BB, CC,
Name3: AA, BB, CC, DD,
Output1:
Name1
AA - FAIL
BB - *
CC - FAIL
<cont>
Name2
AA - FAIL
BB - FAIL
CC - *
<cont>
Output2:
Name1
FAIL - AA CC DD EE
* - BB
Name2
FAIL - AA BB
* - CC
Name3
FAIL - AA BB CC DD

Try this following tested on tclsh8.5
set fd1 [open "input_file_1.txt" r]
set fd2 [open "input_file_2.txt" r]
set opfd [open "output_file.txt" w]
while {[gets $fd1 line] > 0 && [gets $fd2 line2] > 0} {
set line1 [split $line ":"]
set line2 [split $line2 ":"]
puts $opfd [lindex $line1 0]
set last_part_1 [string trim [lindex $line1 1] " "]
set last_part_2 [string trim [lindex $line2 1] " "]
set space_split [split $last_part_1 " "]
set comma_split [split $last_part_2 ","]
for {set i 0} {$i < [llength $space_split]} {incr i} {
puts $opfd "[string trim [lindex $comma_split $i] " "] = [string trim [lindex $space_split $i] " "]"
}
}
close $fd1
close $fd2
close $opfd
There will be file named as output_file.txt created inside current directory which contains your output.

Another way to do it:
package require fileutil
proc getInput filename {
set contents [string trim [::fileutil::cat $filename]]
set rows [split $contents \n]
concat {*}[lmap item $rows {
split $item :
}]
}
set d1 [string map {F Fail} [getInput file1.txt]]
set d2 [string map {, {}} [getInput file2.txt]]
dict for {key values} $d1 {
puts $key
foreach v1 $values v2 [dict get $d2 $key] {
puts " $v2 - $v1"
}
}
This works by recognizing the dictionary-like structure of the data files. If every piece of data is a word without spaces, this version of getInput will coerce the contents of each file to a usable dict. From there, it's just a matter of replacing the F strings with Fail strings and removing the commas, and then doing dictionary iteration over either one of the dicts and pulling in the corresponding values from the other one.
If the values in the second file may contain spaces, getInput should look like this:
proc getInput filename {
set contents [string trim [::fileutil::cat $filename]]
set rows [split $contents \n]
set res {}
foreach item $rows {
lassign [split $item :] key values
if {[string match *,* $values]} {
set values [split [string trimright $values {, }] ,]
}
lappend res $key $values
}
return $res
}
Documentation: concat, dict, foreach, if, lassign, lmap, lmap replacement, package, proc, puts, return, set, split, string

How to unset an arg in tcl

I am taking arguments from command line and passing all those arguments to another program (with expect spawn). I want to parse all options and omit some of them (or do something else). To do that I am doing this:
set arguments [lrange $argv 0 end]
#Check for -lp option. Set the program path
for {set var 0} {$var<$argc} {incr var} {
if {[lindex $arguments $var] == "-lp" || [lindex $arguments $var] == "--launcher-path"} {
if {[lindex $arguments [expr {$var+1}]] != ""} {
set program [lindex $arguments [expr {$var+1}]]
#unset [lindex $arguments $var]
} else {
puts "E: Argument missing for option: [lindex $arguments $var]"
exit 1
}
}
}
But I can't figure out how to unset those args that I used. For example, I need to unset [lindex $arguments [expr {$var+1}]] and [lindex $arguments $var].
This is how I am running the $program:
if {[catch {spawn $program --text {*}$arguments}]} {
puts "E: Launcher not found: $program"
exit 1
}

If your arguments are all key-value, then you can iterate over the arguments in pairs with foreach and build up a new list containing just the arguments you're interested in.
set newarguments [list]
foreach {arg value} $arguments {
switch -exact -- $arg {
"-lp" -
"--launcher-path" {
set program $value
}
default {
lappend newarguments $arg $value
}
}
}
If you have mixed flag and key-value options, then you will need to iterate using an index, similar to your code, but building up the new list of arguments will be roughly the same.
You could also check into the tcllib cmdline package, although that does not handle long options.

This is how I have done it:
set arguments [lreplace $arguments [expr {$var+1}] [expr {$var+1}]]
set arguments [lreplace $arguments $var $var]
As glenn-jackman pointed out, the above can be shortened to:
set arguments [lreplace $arguments $var [expr {$var+1}]]

How to parse a text file in tcl using separators?

I have a text file of the format
35|46
36|49
37|51
38|22
40|1
39|36
41|4
I have to read the file into an array across the separator "|" where left side will be the key of the array and right side will be the value.
I have used the following code
foreach {line} [split [read $lFile] \n] {
#puts $line
foreach {lStr} [split $line |] {
if { $lStr!="" } {
set lPartNumber [lindex $lStr 0]
set lNodeNumber [lindex $lStr 1]
set ::capPartsInterConnected::lMapPartNumberToNodeNumber($lPartNumber) $lNodeNumber
}
}
}
close $lFile
I am not able to read the left side of the separator "|". How to do it?
And similarly for this :
35|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
36|C:\AI\DESIGNS\SAMPLEDSN50\BENCH_WORKLIB.OLB|R
I need to assign all three strings in different variables

You are making mistake in the foreach where the result of split will be assigned to a loop variable lStr where it will contain only one value at a time causing the failure.
With lassign, this can be performed easily.
set fp [open input.txt r]
set data [split [read $fp] \n]
close $fp
foreach line $data {
if {$line eq {}} {
continue
}
lassign [split $line | ] key value
set result($key) $value
}
parray result
lassign [split "35|C:\\AI\\DESIGNS\\SAMPLEDSN50\\BENCH_WORKLIB.OLB|R" |] num userDir name
puts "num : $num"
puts "userDir : $userDir"
puts "name : $name"

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

How to parse txt file containing a repository of patterns - tcl

Related

Inserting single curly braces to Tcl list elements

Parse a CSV file to TCL

How to compare two lines in different files and output the same position in the other line in TCL?

How to unset an arg in tcl

How to parse a text file in tcl using separators?

Categories

Resources