Parse a CSV file to TCL - csv

I have a file as below:
a, b, c, d, e
S, 1.0, 100, F, fast
T, 2.0, 200, S, slow
The first row is the header (a, b, c, d, e); the 2nd and 3rd rows hold the values (e.g. S, 1.0, 100, F, fast) that correspond to the header.
I would like to read the file above into Tcl and print individual values (e.g. row 2, column 5 -> fast).
I wrote the script below, but it doesn't seem to work:
proc game {name infile outfile} {
set csv [open $infile r]
set csv_lines [read $csv]
set out [open $outfile w]
set info [split $csv "\n"]
set infocount [llength $info]
set line 1
foreach line $info {
set values [split $line ","]
set firstline [lindex $values 0]
set secondline [lindex $values 1]
### HOW DO I PUTS OUT ROW2 COL5 or ROW1 COL3 ###
puts $outfile "$firstline"
}
close $infile
close $outfile
}
Want outfile to be as below:
a: S b: 1.0 c: 100 d: F e: fast
a: T b: 2.0 c: 200 d: S e: slow
or
a: T b: 2.0 c: 100 d: F e: slow
a: S b: 1.0 c: 200 d: F e: fast

Using the csv package from tcllib is the way to go for robustness, but on trivial data like this, split will work.
#!/usr/bin/env tclsh
proc game {name infile outfile} {
set in [open $infile r]
set out [open $outfile w]
set header [split [gets $in] ,]
while {[gets $in line] > 0} {
foreach col $header val [split $line ,] {
puts -nonewline $out "[string trim $col]: [string trim $val] "
}
puts $out ""
}
close $in
close $out
}
game foo input.csv output.txt
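For the "row 2, column 5" part of the question, one option is to keep the parsed rows as a list of lists and index into it with lindex. This is only a sketch: parseSimpleCsv is a made-up helper name, the sample text is the data from the question, and it assumes there are no quoted fields or embedded commas (use the csv package for those).

```tcl
# Hypothetical helper: turn simple CSV text into a list of rows,
# each row being a list of whitespace-trimmed fields.
proc parseSimpleCsv {text} {
    set rows {}
    foreach line [split [string trim $text] "\n"] {
        set fields {}
        foreach field [split $line ,] {
            lappend fields [string trim $field]
        }
        lappend rows $fields
    }
    return $rows
}

# Sample data copied from the question
set sample "a, b, c, d, e\nS, 1.0, 100, F, fast\nT, 2.0, 200, S, slow\n"
set rows [parseSimpleCsv $sample]

# lindex counts from 0, and index 0 of rows is the header line
puts [lindex $rows 1 4]   ;# second line of the file, fifth column
puts [lindex $rows 2 2]   ;# third line of the file, third column
```

Since lindex accepts several indices, no extra bookkeeping is needed to address a single cell.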

You might do:
package require csv
proc splitline {fh} {
if {[gets $fh line] != -1} {
set fields [csv::split $line]
return [lmap field $fields {string trimleft $field}]
}
}
proc transform {file} {
set fh [open $file r]
set head [splitline $fh]
while {[set fields [splitline $fh]] ne ""} {
puts [join [lmap h $head f $fields {string cat $h ":" $f}]]
}
close $fh
}
transform "file.csv"
a:S b:1.0 c:100 d:F e:fast
a:T b:2.0 c:200 d:S e:slow

You could use a dict to store the data of the csv file:
proc game {name inFile} {
upvar csv_data csv_data
set csv [open $inFile r]
set csv_lines [read $csv]
set row 0
foreach line [split $csv_lines "\n"] {
set values [split $line ","]
for {set col 0} {$col < [llength $values]} {incr col} {
dict set csv_data $row [expr {$col+1}] [string trim [lindex $values $col]]
}
incr row
}
close $csv
}
set csv_data {}
game foo input.csv
Now you can read from the dict like the below, where row 0 contains the headers, and col 1 is the one with a as header:
# To get row 2 col 5:
puts [dict get $csv_data 2 5]
# => slow
# To get row 1 col 3:
puts [dict get $csv_data 1 3]
# => 100
To print in the other format you asked, you'll need to do a little more work:
set outFile [open output.txt w]
for {set row 1} {$row < [llength [dict keys $csv_data]]} {incr row} {
set lineOut ""
foreach {- header} [dict get $csv_data 0] {- value} [dict get $csv_data $row] {
lappend lineOut "$header: $value"
}
puts $outFile [join $lineOut " "]
}
close $outFile
output.txt:
a: S b: 1.0 c: 100 d: F e: fast
a: T b: 2.0 c: 200 d: S e: slow
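A variation on the same idea, in case looking values up by header name is more convenient than by column number. This is a sketch, not part of the answer above: csvToDicts is a hypothetical name, and it again assumes plain comma-separated data without quoting.

```tcl
# Hypothetical helper: return one dict per data line, keyed by header name.
proc csvToDicts {text} {
    set lines [split [string trim $text] "\n"]
    set header {}
    foreach h [split [lindex $lines 0] ,] {
        lappend header [string trim $h]
    }
    set records {}
    foreach line [lrange $lines 1 end] {
        set rec [dict create]
        foreach h $header v [split $line ,] {
            dict set rec $h [string trim $v]
        }
        lappend records $rec
    }
    return $records
}

# Sample data copied from the question
set records [csvToDicts "a, b, c, d, e\nS, 1.0, 100, F, fast\nT, 2.0, 200, S, slow\n"]

puts [dict get [lindex $records 0] e]   ;# column "e" of the first data line
puts [dict get [lindex $records 1] c]   ;# column "c" of the second data line
```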

Related

tcl how to read files and show the certain words

I asked a question a few days ago, but I think it was not clear, so I am splitting it into several smaller questions.
I have many process files that contain version information. I have extracted certain lines from them with regexp and written them to a txt file, whose format is:
#process #AA_version #BB_version
a11 Aa/10.10-d87_1 Bb/10.57-d21_1
a15 Aa/10.15-d37_1 Bb/10.57-d28_1
a23 Aa/10.20-d51_1 Bb/10.57-d29_3
Each process corresponds to its AA_version and BB_version.
I want to write a Tcl script named get_tool_version.tcl to show and modify (not replace) the content.
If I run tclsh get_tool_version.tcl and enter a process name, it should read the txt file and show its
AA_version=Aa/
BB_version=Bb/
and then I can modify the AA and BB version strings.
Here is my code:
set fp [open tool_version r+]
set file_data [read $fp]
close $fp
set data [split $file_data "\n"]
#input the process
set name [gets stdin] ->#and it'll show correspond AAand BB version
but I don't know how to show its AA_version and BB_version, or how to modify them.
Do I need to use an array?
Thanks
Here's a way:
set fh [open tool_version r]
set data [dict create]
while {[gets $fh line] != -1} {
# Skip lines that do not match, such as the "#process ..." header line
if {[regexp {(\w+)\s+Aa/(\S+)\s+Bb/(\S+)} $line -> process aa bb]} {
dict set data $process Aa $aa
dict set data $process Bb $bb
}
}
close $fh
set name a15 ;# you would get input from user here
puts "process = $name; Aa = [dict get $data $name Aa]; Bb = [dict get $data $name Bb]"
process = a15; Aa = 10.15-d37_1; Bb = 10.57-d28_1
The Tcl regex syntax is here: https://www.tcl-lang.org/man/tcl8.6/TclCmd/re_syntax.htm
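For the "modify" half of the question, a possible approach is to change the entry in the dict and then regenerate the file contents from it. The sketch below works on a string rather than the real file so it can be run as is; parseVersions and formatVersions are hypothetical names, the new version string is made up for illustration, and here the full Aa/... and Bb/... strings are stored so they can be written back unchanged.

```tcl
# Hypothetical helper: parse the table text into a dict keyed by process name.
proc parseVersions {text} {
    set data [dict create]
    foreach line [split $text "\n"] {
        # The "#process ..." header does not match and is skipped
        if {[regexp {^(\w+)\s+(Aa/\S+)\s+(Bb/\S+)} $line -> process aa bb]} {
            dict set data $process Aa $aa
            dict set data $process Bb $bb
        }
    }
    return $data
}

# Hypothetical helper: turn the dict back into the text of the table.
proc formatVersions {data} {
    set lines [list "#process #AA_version #BB_version"]
    dict for {process versions} $data {
        lappend lines "$process [dict get $versions Aa] [dict get $versions Bb]"
    }
    return [join $lines "\n"]
}

# Two lines of the sample table from the question
set text "#process #AA_version #BB_version\na11 Aa/10.10-d87_1 Bb/10.57-d21_1\na15 Aa/10.15-d37_1 Bb/10.57-d28_1"
set data [parseVersions $text]

# Show the current values for one process
puts "AA_version=[dict get $data a15 Aa]"
puts "BB_version=[dict get $data a15 Bb]"

# Modify one value (made-up version string), then print the updated table
dict set data a15 Aa Aa/10.16-d01_1
puts [formatVersions $data]
```

To persist the change, the result of formatVersions would be written back to tool_version with open ... w and puts.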
Here's my final version:
set fp [open tool_version r]
set process [gets stdin]
while {[gets $fp line] != -1} {
if {[regexp $process $line]} {
dict set process1 Aa: [lindex $line 1]
dict set process1 Bb: [lindex $line 2]
puts "Aa: [lindex $line 1]"
puts "Bb: [lindex $line 2]"
}
}
close $fp
Thanks~

Compare columns between 2 files using TCL

I have 2 files having only one column. Say file1.txt and file2.txt.
Below are the contents inside the file
Inside file1.txt
Tom
Harry
Snowy
Edward
Inside file2.txt
Harry
Tom
Edward
I want to write code that checks each item in the column and prints something like the following:
"Tom, Harry, Edward" are present in both the files
Snowy is there in file1.txt but not in file2.txt
My basic code:
set a [open file1.txt r]
set b [open file2.txt r]
while {[gets $a line1] >= 0 && [gets $b line2] >= 0} {
foreach a_line $line1 {
foreach b_line $line2 {
if {$a_line == $b_line } {
puts "$a_line in file test1 is present in $b_line in file test2\n"
} else {
puts "$a_line is not there\n"
}
}
}
}
close $a
close $b
The issue is that it does not check each name in the column.
Any suggestions?
Thanks in advance.
Neel
What you want to do is read each file separately and not have nested loops:
# read the contents of file1 into an associative array
# store the user as an array **key** for fast lookup
set fh [open "file1.txt" r]
while {[gets $fh user] != -1} {
set f1tmp($user) ""
}
close $fh
# read file2 and compare against file1
array set users {both {} file1 {} file2 {}}
set fh [open "file2.txt" r]
while {[gets $fh user] != -1} {
if {[info exists f1tmp($user)]} {
lappend users(both) $user
unset f1tmp($user)
} else {
lappend users(file2) $user
}
}
close $fh
set users(file1) [array names f1tmp]
parray users
users(both) = Harry Tom Edward
users(file1) = Snowy
users(file2) =
Or as Donal suggests, use tcllib
package require struct::set
set fh [open file1.txt r]
set f1users [split [read -nonewline $fh] \n]
close $fh
set fh [open file2.txt r]
set f2users [split [read -nonewline $fh] \n]
close $fh
set results [struct::set intersect3 $f1users $f2users]
puts "in both: [join [lindex $results 0] ,]"
puts "f1 only: [join [lindex $results 1] ,]"
puts "f2 only: [join [lindex $results 2] ,]"
in both: Harry,Tom,Edward
f1 only: Snowy
f2 only:
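If tcllib is not available and the order of the names in file1.txt should be preserved in the report, something along these lines may also work. It is a sketch using only core Tcl: compareLists is a hypothetical name, and the two lists are typed in directly instead of being read from the files.

```tcl
# Hypothetical helper: classify items as present in both lists or only in one.
# Dicts are used as sets so each membership test is a single lookup.
proc compareLists {list1 list2} {
    set seen1 [dict create]
    set seen2 [dict create]
    foreach item $list1 { dict set seen1 $item 1 }
    foreach item $list2 { dict set seen2 $item 1 }

    set both {}
    set only1 {}
    set only2 {}
    foreach item $list1 {
        if {[dict exists $seen2 $item]} {
            lappend both $item
        } else {
            lappend only1 $item
        }
    }
    foreach item $list2 {
        if {![dict exists $seen1 $item]} {
            lappend only2 $item
        }
    }
    return [dict create both $both only1 $only1 only2 $only2]
}

# Contents of file1.txt and file2.txt from the question
set result [compareLists {Tom Harry Snowy Edward} {Harry Tom Edward}]

puts "\"[join [dict get $result both] {, }]\" are present in both the files"
foreach name [dict get $result only1] {
    puts "$name is there in file1.txt but not in file2.txt"
}
foreach name [dict get $result only2] {
    puts "$name is there in file2.txt but not in file1.txt"
}
```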

How to compare two lines in different files and output the same position in the other line in TCL?

I have two files and I want the output like below. Please help by providing me with a TCL script.
File1:
Name1: F * F F F
Name2: F F *
Name3: F F F F
File2:
Name1: AA, BB, CC, DD, EE,
Name2: AA, BB, CC,
Name3: AA, BB, CC, DD,
Output1:
Name1
AA - FAIL
BB - *
CC - FAIL
<cont>
Name2
AA - FAIL
BB - FAIL
CC - *
<cont>
Output2:
Name1
FAIL - AA CC DD EE
* - BB
Name2
FAIL - AA BB
* - CC
Name3
FAIL - AA BB CC DD
Try the following, tested on tclsh8.5:
set fd1 [open "input_file_1.txt" r]
set fd2 [open "input_file_2.txt" r]
set opfd [open "output_file.txt" w]
while {[gets $fd1 line] > 0 && [gets $fd2 line2] > 0} {
set line1 [split $line ":"]
set line2 [split $line2 ":"]
puts $opfd [lindex $line1 0]
set last_part_1 [string trim [lindex $line1 1] " "]
set last_part_2 [string trim [lindex $line2 1] " "]
set space_split [split $last_part_1 " "]
set comma_split [split $last_part_2 ","]
for {set i 0} {$i < [llength $space_split]} {incr i} {
puts $opfd "[string trim [lindex $comma_split $i] " "] = [string trim [lindex $space_split $i] " "]"
}
}
close $fd1
close $fd2
close $opfd
A file named output_file.txt will be created in the current directory, containing your output.
Another way to do it:
package require fileutil
proc getInput filename {
set contents [string trim [::fileutil::cat $filename]]
set rows [split $contents \n]
concat {*}[lmap item $rows {
split $item :
}]
}
set d1 [string map {F Fail} [getInput file1.txt]]
set d2 [string map {, {}} [getInput file2.txt]]
dict for {key values} $d1 {
puts $key
foreach v1 $values v2 [dict get $d2 $key] {
puts " $v2 - $v1"
}
}
This works by recognizing the dictionary-like structure of the data files. If every piece of data is a word without spaces, this version of getInput will coerce the contents of each file to a usable dict. From there, it's just a matter of replacing the F strings with Fail strings and removing the commas, and then doing dictionary iteration over either one of the dicts and pulling in the corresponding values from the other one.
If the values in the second file may contain spaces, getInput should look like this:
proc getInput filename {
set contents [string trim [::fileutil::cat $filename]]
set rows [split $contents \n]
set res {}
foreach item $rows {
lassign [split $item :] key values
if {[string match *,* $values]} {
set values [split [string trimright $values {, }] ,]
}
lappend res $key $values
}
return $res
}
Documentation: concat, dict, foreach, if, lassign, lmap, lmap replacement, package, proc, puts, return, set, split, string
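The answers above produce the Output1 layout. For the Output2 layout (names grouped by status), one possibility is to collect the names into a dict keyed by status with dict lappend. This is a sketch only: groupByStatus is a hypothetical name, and the two lists are the Name1 lines from the question after the commas have been removed.

```tcl
# Hypothetical helper: pair each status with its name and group names by status.
proc groupByStatus {statuses names} {
    set groups [dict create]
    foreach status $statuses name $names {
        dict lappend groups $status $name
    }
    return $groups
}

# Name1 from File1 and File2 in the question
set groups [groupByStatus {F * F F F} {AA BB CC DD EE}]

puts "Name1"
dict for {status names} $groups {
    # Spell out F as FAIL; any other status is printed unchanged
    if {$status eq "F"} {
        set label "FAIL"
    } else {
        set label $status
    }
    puts "$label - [join $names { }]"
}
```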

How to parse txt file containing a repository of patterns

I am new to scripting in Tcl. I want to parse a txt file to create a list of patterns based on 2 strings as input.
My file looks like:
keyw1: data1
keyw1: data2
keyw1: Arg1
:
:
keyword: Pattern2Extract
{
some_lines
keyw1: Arg1
keyw2: patternP1
{
some_lines
}
keyw2: Arg2
{
some_lines
}
keyw2: patternP2
{
some_lines
}
.
.
some_others blocks of declaration between braces {}
.
.
}
keyword: Pattern2Extract
{
some_lines
keyw1: Arg1
keyw2: Arg2
{
some_lines
}
keyw2: patternP1
{
some_lines
}
keyw2: patternP2
{
some_lines
}
.
.
some_others blocks of declaration between braces {}
.
.
}
So, I would like to output 2 lists of 'Pattern2Extract':
list1: if Arg1 is found in the structure grouped between curly braces {}
list2: if Arg1 and Arg2 are both found in the structure grouped between curly braces {}
I have tried lsearch and lindex, and that works for list1, but I don't know how to do it for list2.
Here is my script:
proc inst_nm {inpFile outFile} {
set chanId [open $inpFile r]
set data [list]
while {[gets $chanId line] != -1} {
lappend data $line
}
close $chanId
foreach dt $data {
set MasDat [lindex $dt 0]
set pinDat [lindex $dt 1]
}
set intId [open "./filetoparse.txt" r]
set instDat [list]
while {[gets $intId line] != -1} {
lappend instDat $line
}
close $intId
set writeId [open $outFile a]
set MasterList [lreplace [lsearch -all $instDat *$MasDat*] 0 0]
foreach elem $MasterList {
set cellLn [lindex [split [lindex $instDat $elem ] ":"] 1]
set instName [lindex [split [lindex $instDat [expr $elem -5]] ":"] 1]
set PinLn [lindex [split [lindex $instDat [expr $elem +1]] ":"] 1]
foreach ele $PinLn {
if {"$ele"=="$pinDat" } {
puts $writeId "$instName $pinDat $cellLn"
} else {
puts $writeId "$instName $ele $cellLn"
}
}
}
close $writeId
}
inst_nm [lindex $::argv 0] [lindex $::argv 1]
Currently, inpFile may have many lines like $MasDat $pinDat, and I need to collect the instDat corresponding to each pair ($MasDat, $pinDat).
In filetoparse.txt, by construction, we know that instName comes five lines before $MasDat. However, we don't know the position of the line containing the $pinDat declaration, and this pattern may or may not be present in the instance section:
keyword: Pattern2Extract
{
some_lines
keyw1: Arg1
keyw2: patternP1
{
some_lines
}
keyw2: Arg2
{
some_lines
}
keyw2: patternP2
{
some_lines
}
.
.
some_others blocks of declaration between braces {}
.
.
}
So, in list2 we should get every instName in which $pinDat is found.
Thank you for your help.
It helps to break the code out into another proc. In Tcl, a proc must be defined before it is called. The data file didn't reflect your parser, and the MasterList might also be removing the found item you're looking for (the lreplace ... 0 0 drops the first match). Below is your parser broken up, with example files that reflect what it's doing.
#!/usr/bin/tclsh
proc findPin {MasDat pinDat instDat} {
# set MasterList to the list of indexes found for *$MastDat*
set MasterList [lsearch -glob -all $instDat *$MasDat*]
set found [list]
# for each index number in MasterList
foreach elem $MasterList {
# n-5 (key: value(instName))
# n-4
# n-3
# n-2
# n-1
# n (key: value(cellLn))
# n+1 (key: value(PinLn))
set cellLn [lindex [split [lindex $instDat $elem ] ":"] 1]
set instName [lindex [split [lindex $instDat [expr {$elem - 5}]] ":"] 1]
set PinLn [lindex [split [lindex $instDat [expr {$elem + 1}]] ":"] 1]
foreach ele $PinLn {
if {"$ele"=="$pinDat" } {
lappend found "$instName $pinDat $cellLn"
}
}
}
return $found
}
proc inst_nm {inpFile outFile} {
# get all lines in filetoparse.txt
set intId [open "./filetoparse.txt" r]
set instDat [list]
while {[gets $intId line] != -1} {
lappend instDat $line
}
close $intId
set writeId [open $outFile a]
# Search each line in inpFile
set chanId [open $inpFile r]
while {[gets $chanId line] != -1} {
set MasDat [lindex $line 0]
set pinDat [lindex $line 1]
foreach {item} [findPin $MasDat $pinDat $instDat] {
puts $writeId $item
}
}
close $chanId
close $writeId
}
inst_nm [lindex $::argv 0] [lindex $::argv 1]
filetoparse.txt
INST_NAME:MyInst
unknown-1
unknown-2
unknown-3
unknown-4
CELL_LN:MyCellLn
PIN_LN:pin1 pin2 pin3 pin4 pin5
unknown...
INST_NAME:TestInst
unknown-1
unknown-2
unknown-3
unknown-4
CELL_LN:TestCell
PIN_LN:test1 test2 test3
inputfile.txt
MyCellLn pin4
MyCellLn pin25
TestCell test1
TestCell test10
MyCellLn pin3
Output:
% ./keylist.tcl inputfile.txt keylist_found.txt
% cat keylist_found.txt
MyInst pin4 MyCellLn
TestInst test1 TestCell
MyInst pin3 MyCellLn
Actually, I'm only interested in printing '$instName' for each pair line '$cellLn $pinDat' from inpFile.
filetoparse.txt:
INST_NAME:Inst1
{
4 unknown lines
CELL_LN: Cell1
other unkown lines
PIN_LN:pin1
unkown
PIN_LN:pin5
unknown...
}
INST_NAME:Inst2
{
4 unknown lines
CELL_LN: Cell1
other unkown lines
PIN_LN:pin3
unkown
PIN_LN:pin5
unknown...
}
INST_NAME:Inst3
{
4 unknown lines
CELL_LN: Cell2
other unkown lines
PIN_LN:pin2
unkown
PIN_LN:pin4
unknown...
}
INST_NAME:Inst4
{
4 unknown lines
CELL_LN: Cell2
other unkown lines
PIN_LN:pin5
unkown
PIN_LN:pin2
unknown...
}
inpFile.txt
cell1 pin1
cell2 pin2
So, I want the output file to contain something like:
- for cell1 pin1:
list1: {Inst1 Inst2}
list2: {Inst1}
- for cell2 pin2:
list1: {Inst3 Inst4}
list2: {Inst3 Inst4}
Thank you for your help,
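For this brace-delimited layout, where the CELL_LN and PIN_LN lines are not at fixed offsets, a line-by-line scan that remembers the current instance may be easier than index arithmetic. The following is a sketch under several assumptions: findInstances is a hypothetical name, every block starts with an INST_NAME: line, cell names are compared case-insensitively (the sample input uses cell1 while the data uses Cell1), and the sample text is a shortened version of the file above.

```tcl
# Hypothetical helper: return {list1 list2}, where list1 holds the instances
# whose CELL_LN matches $cell and list2 those that also have a PIN_LN
# line containing $pin.
proc findInstances {text cell pin} {
    set list1 {}
    set list2 {}
    set inst ""
    set cellOk 0
    set pinOk 0
    set lines [split $text "\n"]
    # An empty INST_NAME line at the end flushes the last block
    lappend lines "INST_NAME:"
    foreach line $lines {
        set line [string trim $line]
        if {[regexp {^INST_NAME:\s*(.*)$} $line -> name]} {
            # A new block starts: record the result for the previous one
            if {$inst ne "" && $cellOk} {
                lappend list1 $inst
                if {$pinOk} {
                    lappend list2 $inst
                }
            }
            set inst $name
            set cellOk 0
            set pinOk 0
        } elseif {[regexp {^CELL_LN:\s*(\S+)} $line -> value]} {
            if {[string equal -nocase $value $cell]} {
                set cellOk 1
            }
        } elseif {[regexp {^PIN_LN:\s*(.*)$} $line -> value]} {
            if {[lsearch -exact [split $value] $pin] >= 0} {
                set pinOk 1
            }
        }
    }
    return [list $list1 $list2]
}

# Shortened sample in the layout shown above
set sample {INST_NAME:Inst1
{
CELL_LN: Cell1
PIN_LN:pin1
PIN_LN:pin5
}
INST_NAME:Inst2
{
CELL_LN: Cell1
PIN_LN:pin3
PIN_LN:pin5
}
INST_NAME:Inst3
{
CELL_LN: Cell2
PIN_LN:pin2
PIN_LN:pin4
}}

lassign [findInstances $sample cell1 pin1] list1 list2
puts "list1: {$list1}"
puts "list2: {$list2}"
```

Calling findInstances once per line of inpFile would give the two lists for each (cell, pin) pair.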

How to get the data between two strings from a file in tcl?

In Tcl scripting:
I have a file in which I know how to search for a string, but how do I get the line number where the string is found? Please answer if this is possible.
or
set fd [open test.txt r]
while {![eof $fd]} {
set buffer [read $fd]
}
set lines [split $buffer "\n"]
if {[regexp "S1 Application Protocol" $lines]} {
puts "string found"
} else {puts "not found"}
#puts $lines
#set i 0
#while {[regexp -start 0 "S1 Application Protocol" $lines]==0} {incr i
#puts $i
#}
#puts [llength $lines]
#puts [lsearch -exact $buffer S1]
#puts [lrange $lines 261 320]
With the above program I get the output "string found". If I give a string that is not in this file, I get "not found".
The concept of 'a line' is just a convention that we layer on top of the stream of data that we get from a file. So if you want to work with line numbers then you have to calculate them yourself. The gets command documentation contains the following example:
set chan [open "some.file.txt"]
set lineNumber 0
while {[gets $chan line] >= 0} {
puts "[incr lineNumber]: $line"
}
close $chan
So you just need to replace the puts statement with your code to find the pattern of text you want, and when you find it, the value of $lineNumber gives you the line number.
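As a rough sketch of that replacement: the proc below collects the numbers of all lines that contain a given piece of text. findLineNumbers and demo_lines.txt are made-up names for this example, and string first is used for a plain substring test (swap in regexp or string match if a pattern is needed).

```tcl
# Hypothetical helper: return the 1-based numbers of the lines containing $text.
proc findLineNumbers {filename text} {
    set chan [open $filename r]
    set lineNumber 0
    set hits {}
    while {[gets $chan line] >= 0} {
        incr lineNumber
        # string first returns -1 when $text does not occur in $line
        if {[string first $text $line] >= 0} {
            lappend hits $lineNumber
        }
    }
    close $chan
    return $hits
}

# Create a small throwaway file so the example runs on its own
set demoFile "demo_lines.txt"
set out [open $demoFile w]
puts $out "header line"
puts $out "S1 Application Protocol"
puts $out "trailer line"
close $out

set hits [findLineNumbers $demoFile "S1 Application Protocol"]
puts "found on line(s): $hits"
file delete $demoFile
```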
To copy text that lies between two other lines I'd use something like the following
set chan [open "some.file.txt"]
set out [open "output.file.txt" "w"]
set lineNumber 0
# Read until we find the start pattern
while {[gets $chan line] >= 0} {
incr lineNumber
if { [string match "startpattern" $line]} {
# Now read until we find the stop pattern
while {[gets $chan line] >= 0} {
incr lineNumber
if { [string match "stoppattern" $line] } {
# Stop copying; the output file is closed once, after both loops
break
} else {
puts $out $line
}
}
}
}
close $chan
close $out
The easiest way is to use the fileutil::grep command:
package require fileutil
# Search for ipsum from test.txt
foreach match [fileutil::grep "ipsum" test.txt] {
# Each match is file:line:text
set match [split $match ":"]
set lineNumber [lindex $match 1]
set lineText [lindex $match 2]
# do something with lineNumber and lineText
puts "$lineNumber - $lineText"
}
Update
I realized that if the line itself contains a colon, lineText is truncated at that colon (the third colon of the match). So, instead of:
set lineText [lindex $match 2]
we need:
set lineText [join [lrange $match 2 end] ":"]
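An alternative that avoids splitting and re-joining is to cut the match at its first two colons only. This is a sketch: splitGrepMatch is a hypothetical helper, the sample match string is made up, and it assumes the file name itself contains no colon.

```tcl
# Hypothetical helper: split a "file:line:text" match at the first two colons,
# leaving any further colons inside the text untouched.
proc splitGrepMatch {match} {
    set first [string first ":" $match]
    set second [string first ":" $match [expr {$first + 1}]]
    set file [string range $match 0 [expr {$first - 1}]]
    set lineNumber [string range $match [expr {$first + 1}] [expr {$second - 1}]]
    set text [string range $match [expr {$second + 1}] end]
    return [list $file $lineNumber $text]
}

# Made-up match string in the file:line:text format described above
lassign [splitGrepMatch "test.txt:12:key: value: more"] file lineNumber lineText
puts "$lineNumber - $lineText"
```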