Reordering the contents of a file using tcl

I’m a newbie to Tcl and code writing in general. I have what seems like a simple coding issue that I have spent about 10 hours on and can’t seem to resolve.
I have a file that contains a list of nets: clk123, n789, clk456, n246, and so on. I need to reorder the list so that the clk* nets appear first in the output. I can read the files in question and print the contents to the monitor or a file, but I’m not able to find a way to reorder the list. I’ve spent so much time researching this that I’m now completely confused. Can someone offer a suggestion?

If there are just clk* and n* nets, a simple sort should be sufficient (clk… sorts before n… alphabetically):
package require fileutil
proc sort {data} {
    set lines [split $data \n]
    set lines [lsort $lines]
    join $lines \n
}
::fileutil::updateInPlace thefile.txt sort
Documentation: fileutil package, join, lsort, package, proc, set, split
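If the real net names don't happen to sort alphabetically with clk* first, a sketch of an explicit partition could be used instead (the file name and proc name are placeholders):
package require fileutil

# Put clk* nets first and everything else after, sorting within each group.
proc clkFirst {data} {
    set clk {}
    set other {}
    foreach line [split $data \n] {
        if {[string match clk* $line]} {
            lappend clk $line
        } else {
            lappend other $line
        }
    }
    join [concat [lsort $clk] [lsort $other]] \n
}

::fileutil::updateInPlace thefile.txt clkFirst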

Related

tcl: how to execute tcl command with custom list argument

I have this example tcl code that works by itself:
layout peek -celltext "CellA" "CellB"
I would like to have a list storing CellA and CellB so that I can execute something like
set list [list "CellA" "CellB"]
set OUTPUT [layout peek -celltext [exec echo $list]]
Unfortunately, spending 2 hours trying various combinations isn't getting me anywhere. Any ideas on what I need to do? Forgive me, as I'm not strong in Tcl programming.
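For what it's worth, the expansion syntax discussed in the next question applies here too. Assuming layout peek accepts the cell names as separate arguments (an assumption about that tool's interface), a sketch:
set cells [list "CellA" "CellB"]
# {*} expands the list into separate words, so the command sees
# -celltext "CellA" "CellB" exactly as if typed literally (Tcl 8.5+).
set OUTPUT [layout peek -celltext {*}$cells]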

How does Tcl "file rename {*}[glob *tcl] dir/ " operate

I am trying to move a large number of files using Tcl and came across the expression :
file rename {*}[glob *tcl] dir/ which works perfectly.
Can anyone explain how this command works or what this feature is called?
It's a compound of two commands and some useful syntax.
glob returns a list of filenames that match the pattern, *tcl in your case, or an error if nothing matches. There's a bunch of options you could use to modify what it returns, but you're not using any of them; that's great for your use case.
file rename will rename files or move files around. In particular, when the final argument is an existing directory name, the other arguments are files (or directories) that will be moved into that directory. (That it moves things around is sensible if you're familiar with how POSIX system calls work.)
The final piece of the puzzle is {*}[…], i.e., the expansion syntax applied to a command substitution: it runs a command (glob *tcl in your case) and uses the elements of the list it returns as a sequence of separate arguments to the command call within which it appears. That's exactly what's wanted here, since we need a list of filenames at that point in the call to file rename. There's no real limit on the number of arguments that can be passed around that way, other than basic things like memory and so on.
The {*} prefix (it's only special at the start of a word) can be used with other well-formed ways of producing a Tcl word (e.g., a read from a variable with $ or a literal with {…}) or even with a compound word, though use with compound words is usually a sign that what you're doing is probably unwise.
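For instance, a small sketch showing expansion applied to a list held in a variable (directory name as in the question):
set files [glob *tcl]
# Each element of $files becomes its own argument to file rename.
file rename {*}$files dir/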
If you have old Tcl code, written for Tcl 8.4 or before, you won't see {*}. Instead, you'd see something like this:
eval file rename [glob *tcl] dir/
# Or, more properly, one of these horrors:
eval {file rename} [glob *tcl] {dir/}
eval [list file rename] [glob *tcl] [list dir/]
eval [linsert [linsert [glob *tcl] 0 file rename] end dir/]
These were notoriously awkward to get right in tricky cases (causing many subtle bugs). The expansion syntax was added in Tcl 8.5 exactly to get rid of this whole class of trouble. eval still exists in modern Tcl, but it is now thankfully rarely used.

In relative terms, how fast should TCL on Windows 10 be?

I have the latest TCL build from ActiveState installed on a desktop and a laptop, both running Windows 10. I'm new to TCL and a novice developer, and my reason for learning TCL is to enhance my value on the F5 platform. I figured a good first step would be to stop the occasional work I do in VBScript and port that to TCL. Learning the language itself is coming along alright, but I'm worried my project isn't viable due to performance. My VBScripts absolutely destroy my TCL scripts in performance. I didn't expect that outcome, as my understanding was that TCL was so "fast" and that's why it was chosen by F5 for iRules etc.
So the question is, am I doing something wrong? Is the port for Windows just not quite there? Perhaps I misunderstood the way in which TCL is fast and it's not fast for file parsing applications?
My test application is a firewall log parser. Take a log with 6 million hits, find the unique src/dst/port/policy entries and count them, split up into accept and deny. Opening the file and reading the lines is fine: TCL processes 18k lines/second while VBScript does 11k. As soon as I do anything with the data, the tide turns. I need to break the four pieces of data noted above out of the line that was read and put them in an array. I've "split" the line and done a for-next loop to read and match each part of the line; that's the slowest. I've done a regexp with subvariables that extracts all four elements in a single call, and that's much faster, but it's twice as slow as doing four regexps with a single variable each and then cleaning the excess data from the match away with trims. But even this method is four times slower than VBScript with ad-hoc splits/for-next matching and trims. On my desktop, I get 7k lines/second with TCL and 25k with VBScript.
Then there's the array. I assume that because my 3-dimensional array isn't a real array, searching through 3x as many entries is slowing it down. I may try to break up the array so it's looking through a third of the data it currently does. But the truth is, by the time the script gets to the point where there are a couple hundred entries in the array, it has dropped from processing 7k lines/second to less than 2k. My VBScript drops from about 25k lines/second to 22k. And so I don't see much hope.
I guess what I'm looking for in an answer, for those with TCL experience and general programming experience, is: is TCL natively slower than VB and other scripting languages for what I'm doing? Is it the port for Windows that's slowing it down? What kinds of applications is TCL "fast" at or good at? If I need to try a different kind of project than reading and manipulating data from files, I'm open to that.
edited to add code examples as requested:
while { [gets $infile line] >= 0 } {
    # some other commands I'm cutting out for the sake of space; they don't contribute to slowness

    # the single regexp with subvariables below was unexpectedly slow
    regexp {srcip=(.*)srcport.*dstip=(.*)dstport=(.*)dstint.*policyid=(.*)dstcount} $line -> srcip dstip dstport policyid

    # the fastest way to extract data I've found so far
    regexp {srcip=(.*)srcport} $line srcip
    set srcip [string trim $srcip "cdiloprsty="]
    regexp {dstip=(.*)dstport} $line dstip
    set dstip [string trim $dstip "cdiloprsty="]
    regexp {dstport=(.*)dstint} $line dstport
    set dstport [string trim $dstport "cdiloprsty="]
    regexp {policyid=(.*)dstcount} $line a policyid
    set policyid [string trim $policyid "cdiloprsty="]
Here is the array search that really bogs down after a while:
set start [array startsearch uList]
while {[array anymore uList $start]} {
    incr f
    # "key" returns the NAME of the association and uList(key) the VALUE associated with that name
    set key [array nextelement uList $start]
    if {$uCheck == $uList($key)} {
        ##puts "$key CONDITION MET"
        set flag true
        adduList $uCheck $key $flag2
        set flag2 false
        break
    }
}
Your question is still a bit broad in scope.
F5 has published some comments on why they chose Tcl and how it is fast for their specific use cases. That is actually rather different from a log-parsing use case, as they do all the heavy lifting in C code (via custom commands) and use Tcl mostly as a fast dispatcher and for a bit of flow control. And Tcl is really good at that compared to various other languages.
For things like log parsing, Tcl is often beaten in performance by languages like Python and Perl in simple benchmarks. There are a variety of reasons for that; here are some of them:
Tcl uses a different regexp style (DFA), which is more robust for nasty patterns but slower for simple patterns.
Tcl has a more abstract I/O layer than, for example, Python, and usually converts the input to Unicode, which adds some overhead if you do not disable it (via fconfigure; a sketch follows after this list).
Tcl has proper multithreading rather than a global interpreter lock, which costs around 10-20% performance for single-threaded use cases.
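For example, a minimal sketch of that fconfigure tweak (the file name is made up; it assumes the log is plain ASCII):
set infile [open firewall.log r]
# Skip Unicode conversion and CR/LF translation on the input channel,
# and use a larger read buffer.
fconfigure $infile -encoding binary -translation lf -buffersize 65536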
So how to get your code fast(er)?
Try a more specific regular expression; those greedy .* patterns are bad for performance.
Try to use string commands instead of regexp; a few string first commands followed by string range could be faster than a regexp for these simple patterns (see the sketch after this list).
Use a different structure for that array; you probably want either a dict or some form of nested list.
Put your code inside a proc rather than in a toplevel script, and use local variables instead of globals, to make the bytecode faster.
If you want, use one thread for reading lines from the file and multiple threads for extracting the data, like a typical producer-consumer pattern.
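Putting several of those points together, here is a sketch (the field tags come from the snippets above; the proc name, file name, and exact field layout are otherwise assumptions): the work lives in a proc, each field is pulled out with string first / string range, and a dict keyed on the whole src/dst/port/policy tuple replaces the linear array scan:
proc parseLog {fname} {
    set counts [dict create]
    set infile [open $fname r]
    while {[gets $infile line] >= 0} {
        # Pull each field out by locating its start and end tags.
        foreach {startTag endTag var} {
            srcip=    srcport   srcip
            dstip=    dstport=  dstip
            dstport=  dstint    dstport
            policyid= dstcount  policyid
        } {
            set from [expr {[string first $startTag $line] + [string length $startTag]}]
            set to   [expr {[string first $endTag $line $from] - 1}]
            set $var [string trim [string range $line $from $to]]
        }
        # Count each unique tuple directly; no explicit search needed.
        # (Splitting into accept/deny could be handled with two dicts.)
        dict incr counts [list $srcip $dstip $dstport $policyid]
    }
    close $infile
    return $counts
}
Afterwards, something like dict for {tuple n} [parseLog firewall.log] {puts "$n\t$tuple"} prints the counts.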

Difference tcl script tkconsole to load gro file in VMD

My problem is simple: I'm trying to write a tcl script that uses $grofile instead of writing the file name every time I need it.
So, what I did in TkConsole was:
% set grofile "file.gro"
% mol load gro ${grofile}
and, indeed, I succeeded in loading the file.
In the script I have the same lines, but I still get this error:
wrong # args: should be "set varName ?newValue?"
can't read "grofile": no such variable
I tried to solve my problem with
% set grofile [./file.gro]
and I have this error,
invalid command name "./file.gro"
can't read "grofile": no such variable
I tried also with
% set grofile [file ./file.gro r]
and I got the first error, again.
I haven't found any simple way to avoid using the explicit name of the file I want to load. It seems like you can only use the most trivial, but tedious, way:
mol load file.gro
mol addfile file.xtc
and so on and so on...
Can you give me a brief explanation of why I can load the file and use it as a variable in the TkConsole, but not in the tcl script?
Also, if you can see where my mistake is, I would appreciate it.
I apologize if it is basic, but I could not find any answer. Thanks.
Here is the head of my script:
set grofile "sim.part0001_protein_lipid.gro"
set xtcfile "protein_lipid.xtc"
set intime "0-5ms"
set system "lower"
source view_change_render.tcl
source cg_bonds.tcl
mol load gro $grofile xtc ${system}_${intime}_${xtcfile}
It was solved, thanks for your help.
You may think you've typed the same thing, but you haven't. I'm guessing that your real filename has spaces in it, and that you've not put double-quotes around it. That will confuse set as Tcl's general parser will end up giving set more arguments than it expects. (Tcl's general parser does not know that set only takes one or two arguments, by very long standing policy of the language.)
So you should really do:
set grofile "file.gro"
Don't leave the double quotes out if you have a complicated name.
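A minimal illustration of the difference (the spaced file name is made up):
set grofile sim file.gro      ;# three arguments reach set -> wrong # args error
set grofile "sim file.gro"    ;# one word -> the whole name is assigned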
Also, this won't work:
set grofile [./file.gro]
because […] is used to indicate running something as a command and using the result of that. While ./file.gro is actually a legal command name in Tcl, it's… highly unlikely.
And this won't work:
set grofile [file ./file.gro r]
Because the file command requires a subcommand as a first argument. The word you give is not one of the standard file subcommands, and none of them accept those arguments anyway, which look suitable for open (though that returns a channel handle suitable for use with commands like gets and read).
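If reading the file's contents had actually been the goal (mol load only needs the name), the usual pattern is open, read, close; a sketch:
set chan [open ./file.gro r]   ;# open returns a channel handle
set contents [read $chan]
close $chan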
The TkConsole is actually pretty reasonable as quick-and-dirty terminal emulations go (given that it omits a lot of the complicated cases). The real problem is that you're not being consistently accurate about what you're really typing; that matters hugely in most programming languages, not just Tcl. You need to learn to be really exacting; cut-n-paste when creating a question helps a lot.

TCL throws invalid command name when writing csv data to a matrix within a namespace

This is a bizarre issue that I can't seem to figure out. I am using TCL 8.5 and I am trying to read data from a CSV file into a matrix using the csv::read2matrix command. However, every time I do it, it says the matrix I am trying to write to is an invalid command. Here is a snippet of what I am doing:
package require csv
package require struct::matrix
namespace eval ::iostandards {
    namespace export *
}

proc iostandards::parse_stds { io_csv } {
    # Create matrix
    puts "Creating matrix..."
    struct::matrix iostdm

    # Add columns
    puts "Adding columns to matrix..."
    iostdm add columns 6

    # Open file
    set fid [open $io_csv r]
    puts $fid

    # Read CSV into matrix
    puts "Reading data into matrix..."
    csv::read2matrix $fid iostdm {,}
    close $fid
}
When I run this code in a TCLSH, I get this error:
invalid command name "iostdm"
As far as I can tell, my code is correct (it works when I don't put it in a namespace). I tried namespace import ::csv::* ::struct::matrix::* and it didn't do anything.
Is there something I am missing with these packages? Nothing on the wiki.tcl.tk website mentions anything of the sort, and the man pages for the packages don't mention anything about being called from within another namespace.
The problem is that iostdm is defined inside the iostandards namespace. That means it should be referenced as iostandards::iostdm, and that is how you should pass it to csv::read2matrix:
csv::read2matrix $fid iostandards::iostdm {,}
Update
I noticed that you hard-coded adding 6 columns to the matrix before reading. A better way is to tell csv::read2matrix to expand the matrix automatically:
csv::read2matrix $fid iostandards::iostdm , auto
I want to add to Hai Vu's answer
From my testing, for commands such as csv::read2matrix and csv::write2matrix, if you have nested namespaces it appears you have to qualify the matrix name all the way from the topmost namespace. I had a case where the namespaces were nested several levels deep, and the call looked like this:
csv::read2matrix $fid ::highest::higher::high::medium::low::iostdm , auto
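Along the same lines, one way to avoid hard-coding the namespace at every call is to build the qualified name with namespace current; a sketch based on the snippet above:
proc iostandards::parse_stds { io_csv } {
    # Build the fully qualified matrix command name once.
    set m [namespace current]::iostdm
    struct::matrix $m

    set fid [open $io_csv r]
    # Let read2matrix size the matrix itself (no hard-coded column count).
    csv::read2matrix $fid $m , auto
    close $fid
    return $m
}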