I am dealing with "large" measurement data, approximately 30K key-value
pairs. The measurement runs for a number of iterations, and after each
iteration a datafile (non-CSV) with 30K key-value pairs is created. I want
to somehow create a CSV file of the form:
Key1,value of iteration1,value of iteration2,...
Key2,value of iteration1,value of iteration2,...
Key3,value of iteration1,value of iteration2,...
...
Now, I was wondering about an efficient way of adding each iteration's
measurement data as a column to the CSV file in Tcl. So far it seems that
either way I will need to load the whole CSV file into some variable
(array/list) and add the new measurement data to each element. This seems
somewhat inefficient. Is there another way, perhaps?
Because CSV files are fundamentally text files, you have to load all the data in and write it out again. There's no other way to expand the number of columns since the data is fundamentally row-major. The easiest way to do what you say you want (after all, 30k-pairs isn't that much) is to use the csv package to do the parsing work. This code might do what you're looking for…
package require csv
package require struct::matrix
# Load the file into a matrix
struct::matrix data
set f [open mydata.csv]
csv::read2matrix $f data , auto
close $f
# Add your data
set newResults {}
foreach key [data get column 0] {
    lappend newResults [computeFrom $key]; # This is your bit!
}
data add column $newResults
# Write back out again
set f [open mydata.csv w]
csv::writematrix data $f
close $f
You would probably be better off using a database though. Both metakit and sqlite3 work very well with Tcl, and handle this sort of task well.
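Following up on that database suggestion, here is a minimal SQLite sketch. The table layout, file name, and the storeIteration helper are all invented for illustration; the point is that each iteration becomes rows in a table rather than an ever-wider CSV, so nothing ever needs to be rewritten:

```tcl
package require sqlite3

# One row per (key, iteration) pair instead of one column per iteration.
sqlite3 db measurements.db
db eval {
    CREATE TABLE IF NOT EXISTS results (
        key       TEXT,
        iteration INTEGER,
        value     TEXT,
        PRIMARY KEY (key, iteration)
    )
}

# After iteration $i, insert the 30K pairs held in the dict $pairs.
# A transaction makes the bulk insert fast.
proc storeIteration {i pairs} {
    db transaction {
        dict for {key value} $pairs {
            db eval {INSERT OR REPLACE INTO results VALUES ($key, $i, $value)}
        }
    }
}

storeIteration 1 {Key1 10 Key2 20}
storeIteration 2 {Key1 11 Key2 21}

# All values for one key, in iteration order, pulled out on demand:
db eval {SELECT key, value FROM results WHERE key = 'Key1' ORDER BY iteration} {
    puts "$key -> $value"
}
db close
```

If a CSV is still needed at the end, it can be generated once from a query instead of being maintained incrementally.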
I have many process files, each of which contains versions. I use regexp to extract certain lines from them and import them into a txt file. The txt format is like
#process #AA_version #BB_version
a11 Aa/10.10-d87_1 Bb/10.57-d21_1
a15 Aa/10.15-d37_1 Bb/10.57-d28_1
a23 Aa/10.20-d51_1 Bb/10.57-d29_3
and then I convert the txt to CSV format like
,#process,#AA_version,#BB_version
,a11,Aa/10.10-d87_1,Bb/10.57-d21_1
,a15,Aa/10.15-d37_1,Bb/10.57-d28_1
,a23,Aa/10.20-d51_1,Bb/10.57-d29_3
And now I want to write a Tcl script (get_version.tcl) that reports the corresponding versions (from the file where I regexp'd the version) after parsing the CSV file.
For example, if I run tclsh get_version.tcl to parse the CSV file and input the process I want to modify (a11), it should puts the current versions, AA_version: Aa/10.10-d87_1 and BB_version: Bb/10.57-d21_1, and then let me modify a version.
get_version.tcl should then update the CSV file and the original file (where I regexp'd) at the same time.
Is that possible? Also, I cannot install tcllib; can I do these things without tcllib?
This involves too many commands and I don't know how to start. Thanks for the help~
If I want to change the AA_version of process a11, I would use
get_version.tcl a11
AA_version = **Aa/10.10-d87_1**
BB_version = **Bb/10.57-d21_1**
It should read the CSV file and add the new version (e.g. Aa/10.13-d97_1), and this action should change the original file (1. add a new line AA_version = Aa/10.13-d97_1; 2. modify AA_version = Aa/10.10-d87_1 -> #AA_version = Aa/10.10-d87_1).
I'd process each line something like this:
split [regsub -all {\s+} $line "\u0000"] "\u0000"
(The tcllib textutil::splitx command does something similar.)
If you don't have commas or quotes in your input data — and it looks like you might be able to guarantee that easily — then you can just join $record "," to get a line you can write out with puts. If you do have commas or quotes or stuff like that, use the Tcllib csv package because that handles the tricky edge cases correctly.
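To illustrate the difference the previous paragraph describes (this sketch assumes Tcllib is available; the sample fields are made up):

```tcl
package require csv

# Fields with no commas or quotes: a bare join is fine.
set plain {a11 Aa/10.10-d87_1 Bb/10.57-d21_1}
puts [join $plain ","]    ;# a11,Aa/10.10-d87_1,Bb/10.57-d21_1

# Fields containing commas or quotes: csv::join quotes them as needed,
# which a bare join would get wrong.
set tricky {a11 {value, with comma}}
puts [csv::join $tricky]
```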
As a simple stdin→stdout filter, the script would be:
while {[gets stdin line] >= 0} {
    set data [split [regsub -all {\s+} $line "\u0000"] "\u0000"]
    # Any additional column mangling here
    puts [join $data ","]
}
I am new to the Tcl language and wish to know how I can do the following. Assume I have a program that creates one text file per run and should be run 10000 times. Every run creates a text file called "OUT.out". All I am interested in is a single number in a specific column of that OUT.out file from each run.
The ideal flow for a single run would be as follows:
Start the main Run, (should be repeated for 10000 times, assumed)
Run Case 1
Finish the Case 1
Open the text file, OUT.out.
Find the maximum absolute value in the 4th column of the text file.
Save the max value in a separate text file in row 1.
delete the OUT.out file
Run Case 2
Finish the Case 2 of the main loop
Open the text file, OUT.out.
Find the maximum absolute value in the 4th column of the text file.
Save the max value in a separate text file in row 2.
delete the OUT.out file
Run Case 3
Finish the Case 3 of the main loop
Open the text file, OUT.out.
Find the maximum absolute value in the 4th column of the text file.
Save the max value in a separate text file in row 3.
delete the OUT.out file
Run Case 4
.
.
.
I presume the code should be shorter than my note. Thanks in advance for your help.
Depending on what the separator is, you might do:
# Read in the data and list-ify it; REAL data is often messier though.
# (string trim avoids an empty row from the trailing newline; lmap needs Tcl 8.6)
set f [open OUT.out]
set table [lmap row [split [string trim [read $f]] "\n"] {split $row}]
close $f
# Kill that unwanted file
file delete OUT.out
# Tcl indexes start at 0
set col4abs [lmap row $table {
    expr { abs([lindex $row 3]) }
}]
# Get the maximum of a list of values
set maxAbs [tcl::mathfunc::max {*}$col4abs]
# You don't say what file to accumulate maximums in
set f [open accumulate.out "a"]; # IMPORTANT: a == append mode
puts $f $maxAbs
close $f
and then repeat that after each run. I'm sure you can figure out how to do that bit.
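The repeat-per-run part can be sketched as a procedure driven by a loop. Everything here is illustrative: runCase is a stub standing in for however your program is actually launched, and the output file name is invented.

```tcl
# Stub: replace with whatever actually starts one run of your program.
# This one just fabricates an OUT.out with a 4th column.
proc runCase {i} {
    set f [open OUT.out w]
    puts $f "0 0 0 [expr {-5 * $i}]"
    puts $f "0 0 0 [expr {4 * $i}]"
    close $f
}

# One run's worth of work: read OUT.out, delete it, append the maximum
# absolute value of column 4 to the accumulator file.
proc recordMax {src dst} {
    set f [open $src]
    set table [lmap row [split [string trim [read $f]] "\n"] {split $row}]
    close $f
    file delete $src
    set maxAbs [tcl::mathfunc::max {*}[lmap row $table {
        expr { abs([lindex $row 3]) }
    }]]
    set f [open $dst a]
    puts $f $maxAbs
    close $f
}

for {set case 1} {$case <= 10000} {incr case} {
    runCase $case
    recordMax OUT.out accumulate.out
}
```

Row N of accumulate.out then holds the maximum from run N, as described in the question.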
But if you're doing this a lot, you probably should look into storing the results in a database instead; they're much better suited for this sort of thing than a pile of ordinary files. (I can thoroughly recommend SQLite; we moved our bulk result data management into it and greatly improved our ability to manage things, and that's keeping lots of quite big binary blobs as well as various chunks of analysable metadata.)
Please help me with a script that outputs a file containing the names of the files in subdirectories and their sizes in bytes. The argument to the program is the folder path. The output file should have the file name in the first column and its size in the second column.
Note: the folder contains subfolders, and inside the subfolders there are files.
I tried this way:
set fp [open files_memory.txt w]
set file_names [glob ../design_data/*/*]
foreach file $file_names {
    puts $fp "$file [lindex [exec du -sh $file] 0]"
}
close $fp
Result sample:
../design_data/def/ip2.def.gz 170M
../design_data/lef/tsmc13_10_5d.lef 7.1M
But I want only the file name to be printed, i.e. ip2.def.gz, tsmc13_10_5d.lef, etc. (not the entire path), and the file sizes should be aligned.
The fileutil package in Tcllib defines the command fileutil::find, which can recursively list the contents of a directory. You can then use foreach to iterate over the list and get the sizes of each of them with file size, before producing the output with puts, perhaps like this:
puts "$filename\t$size"
The $filename is the name of the file, and the $size is how large it is. You will have obtained these values earlier (i.e., in the line or two before!). The \t in the middle is turned into a TAB character. Replace with spaces or a comma or virtually anything else you like; your call.
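Putting those pieces together might look like the following sketch. To keep it self-contained it fabricates a tiny demo tree first; in the real script you would point fileutil::find at your own directory (e.g. ../design_data) instead:

```tcl
package require fileutil

# Demo data, standing in for the real ../design_data tree.
file mkdir demo_data/def demo_data/lef
fileutil::writeFile demo_data/def/ip2.def.gz "dummy"
fileutil::writeFile demo_data/lef/tsmc13_10_5d.lef "dummy content"

# Recursively list regular files, then write "name<TAB>size" per line.
set fp [open files_memory.txt w]
foreach file [lsort [fileutil::find demo_data {file isfile}]] {
    puts $fp "[file tail $file]\t[file size $file]"
}
close $fp
```

The {file isfile} filter skips directories, so only real files end up in the output.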
To get just the last part of the filename, I'd do:
puts $fp "[file tail $file] [file size $file]"
This writes out the full file size, not the abbreviated form, so if you really want 4k instead of 4096, keep using that (slow) incantation with exec du. (If the consumer is a program, or a programmer, writing out the size in full is probably better.)
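If you do want du-style abbreviated sizes without shelling out, a rough formatter is easy to write in plain Tcl. This is my own sketch; du's exact rounding rules differ slightly:

```tcl
# Convert a byte count to a short human-readable form (4096 -> 4K).
proc prettySize {bytes} {
    foreach unit {B K M G T} {
        if {$bytes < 1024} {
            return [format "%.3g%s" $bytes $unit]
        }
        set bytes [expr {$bytes / 1024.0}]
    }
    # Anything past tebibytes is still reported in T.
    return [format "%.3g%s" [expr {$bytes * 1024}] T]
}

puts [prettySize 4096]       ;# 4K
puts [prettySize 7445000]    ;# 7.1M
```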
In addition to Donal's suggestion, there are more tools for getting files recursively:
recursive_glob (from the Tclx package) and
for_recursive_glob (also from Tclx)
fileutil::findByPattern (from the fileutil package)
Here is an example of how to use for_recursive_glob:
package require Tclx
for_recursive_glob filename {../design_data} {*} {
    puts $filename
}
This suggestion, in combination with Donal's should be enough for you to create a complete solution. Good luck.
Discussion
The for_recursive_glob command takes 4 arguments:
The name of the variable representing the complete path name
A list of directories to search (e.g. {/dir1 /dir2 /dir3})
A list of patterns to search for (e.g. {*.txt *.c *.cpp})
Finally, the body of the for loop, where you do something with the filename.
In my experience, for_recursive_glob cannot handle directories that you don't have permission to read (on Mac, Linux, and BSD platforms at least; I don't know about Windows). In that case, the script will crash unless you catch the exception.
The recursive_glob command is similar, but it returns a list of filenames instead of structuring them in a for loop.
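If Tclx is not available, a plain-Tcl recursive walk with glob is easy to write, and a catch around the descent sidesteps the unreadable-directory problem mentioned above. A sketch (findFiles is my own helper name):

```tcl
# Recursively collect regular files under $dir; directories we cannot
# read are silently skipped instead of crashing the script.
proc findFiles {dir} {
    set result {}
    foreach path [glob -nocomplain -directory $dir *] {
        if {[file isdirectory $path]} {
            if {[catch {findFiles $path} sub] == 0} {
                lappend result {*}$sub
            }
        } else {
            lappend result $path
        }
    }
    return $result
}
```

Usage would then be e.g. `foreach file [findFiles ../design_data] { ... }`.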
This is a bizarre issue that I can't seem to figure out. I am using Tcl 8.5 and I am trying to read data from a CSV file into a matrix using the csv::read2matrix command. However, every time I do it, it says the matrix I am trying to write to is an invalid command. A snippet of what I am doing:
package require csv
package require struct::matrix
namespace eval ::iostandards {
namespace export *
}
proc iostandards::parse_stds { io_csv } {
    # Create matrix
    puts "Creating matrix..."
    struct::matrix iostdm
    # Add columns
    puts "Adding columns to matrix..."
    iostdm add columns 6
    # Open File
    set fid [open $io_csv r]
    puts $fid
    # Read CSV to matrix
    puts "Reading data into matrix..."
    csv::read2matrix $fid iostdm {,}
    close $fid
}
When I run this code in tclsh, I get this error:
invalid command name "iostdm"
As far as I can tell, my code is correct (when I don't put it in a namespace). I tried namespace import ::csv::* and ::struct::matrix::* and it didn't do anything.
Is there something I am missing with these packages? Nothing on the wiki.tcl.tk website mentions anything of the sort, and the man pages for the packages don't mention anything about being called from within another namespace.
The problem is that iostdm is defined inside the iostandards namespace. That means it should be referenced as iostandards::iostdm, and that is how you should pass it to csv::read2matrix:
csv::read2matrix $fid iostandards::iostdm {,}
Update
I noticed that you hard-coded adding 6 columns to the matrix before reading. A better way is to tell csv::read2matrix to expand the matrix automatically:
csv::read2matrix $fid iostandards::iostdm , auto
I want to add to Hai Vu's answer.
From my testing, with commands such as csv::read2matrix and csv::writematrix, if you have nested namespaces, it appears you have to qualify the matrix name all the way from the global namespace.
I had a case where the structure was...
csv::read2matrix $fid ::highest::higher::high::medium::low::iostdm , auto
My Tcl application should read and store a lot of configuration parameters. I'd like to use a regular disk file as storage rather than the registry or something else.
It would be great to store parameters hierarchically. All my parameters are strings, numbers, and lists of them. The configuration file(s) may be placed in any directory (not only the user's home). Normally the application expects the configuration file in the current directory.
Do you know any ready-to-use Tcl library?
More general question: what is the "Tcl-way" to read/write application configuration?
Thanks.
If the configuration does not necessarily need to be human-readable, I suggest you consider SQLite -- it began as a Tcl extension, and therefore Tcl's SQLite bindings are more mature than any other language's.
See: http://www.sqlite.org/tclsqlite.html
If you don't need random access (that is, configuration files are not huge and each can be slurped completely at once) and don't require processing by external tools, you could just use flat text files containing, say, Tcl lists. The "trick" is that in Tcl each value must have a valid string representation (when asked) and can be reconstructed from its string representation. You get that for free, that is, no special package is required and all you have to provide is some sort of structure to bind serialized values to their names.
To demonstrate:
set a "a string"
set b 536
set c {this is a list {with sublist}}
proc cf_write {fname args} {
    set fd [open $fname w]
    chan configure $fd -encoding utf-8
    set data [list]
    foreach varName $args {
        upvar 1 $varName var
        lappend data [list $varName $var]
    }
    puts $fd $data
    close $fd
}

proc cf_read fname {
    set fd [open $fname]
    chan configure $fd -encoding utf-8
    set data [read $fd]
    close $fd
    set data
}
set cfile [file join [file dir [info script]] conf.txt]
cf_write $cfile a b c
foreach entry [cf_read $cfile] {
    lassign $entry name value
    puts "$name: $value"
}
You'll get this output:
a: a string
b: 536
c: this is a list {with sublist}
Now if you feel like having something fancier or more "interoperable", look at the YAML or JSON (you'll need to write a serializer for that one, though) or INI formats, all available from Tcllib and hence plain Tcl.
Even fancier would be using XML via tDOM (an expat-based C extension). SQLite, which has already been proposed, is even more capable than that (it provides random access to the data and can operate on huge data sets). But it seems that for your task these tools are too heavy-weight.
Note that my example deliberately opts to show how to store/restore an arbitrary ad-hoc list of variables, so the cf_write procedure builds the Tcl list to be stored by itself. Of course, nothing prevents you from building that list yourself, providing for hierarchical structures of arbitrary complexity. One caveat is that in this case you might (or might not) face the problem of deconstructing the restored list. But if you stick to the general rule of each element being a name/value pair, as in my example, deconstruction shouldn't be hard.
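For example, a hierarchical configuration can simply be a nested dict, which round-trips through its string representation for free in the same way. A sketch (section and key names invented):

```tcl
# A nested dict serializes and deserializes via its string representation.
set config [dict create \
    window [dict create width 800 height 600] \
    paths  [dict create data {/var/data} plugins {/usr/lib/app /opt/app}]]

# Store: the dict's string rep goes straight to the file.
set fd [open conf.txt w]
puts $fd $config
close $fd

# Restore: the string read back is usable as a dict immediately.
set fd [open conf.txt]
set restored [read -nonewline $fd]
close $fd

puts [dict get $restored window width]    ;# 800
puts [dict get $restored paths plugins]   ;# /usr/lib/app /opt/app
```

Multi-level dict get and dict set give you the hierarchical access the question asks about without any extra machinery.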
Tcllib contains a package, inifile, for handling Windows .ini-format configuration files. As it's part of Tcllib it should be available on all platforms (I've just checked and it loads OK on my Solaris 8 box). It allows you to both read and write .ini files and access the configuration by section and key.
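Using inifile looks roughly like this; the file name and the display section with its keys are invented for illustration, and my reading of the package's access modes may be slightly off, so treat this as a sketch:

```tcl
package require inifile

# Write a config file: set keys in a section, then commit to disk.
set ini [ini::open settings.ini w]
ini::set $ini display width 800
ini::set $ini display height 600
ini::commit $ini
ini::close $ini

# Read it back by section and key.
set ini [ini::open settings.ini r]
puts [ini::value $ini display width]
puts [ini::sections $ini]
ini::close $ini
```

Since .ini sections are flat, deeper hierarchies need an encoding convention (e.g. dotted section names), which is one reason the nested-dict or SQLite approaches above may suit deeply hierarchical configurations better.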