My problem is simple: I'm trying to write a tcl script to use $grofile instead writing every time I need this file name.
So, what I did in TkConsole was:
% set grofile "file.gro"
% mol load gro ${grofile}
and, indeed, I succeeded uploading the file.
In the script I have the same lines, but still have this error:
wrong # args: should be "set varName ?newValue?"
can't read "grofile": no such variable
I tried to solve my problem with
% set grofile [./file.gro]
and I have this error,
invalid command name "./file.gro"
can't read "grofile": no such variable
I tried also with
% set grofile [file ./file.gro r]
and I got the first error, again.
I haven't found any simple way to avoid using the explicit name of the file I want to upload. It seems like you only can use the most trivial, but tedious way:
mol load file.gro
mol addfile file.xtc
and so on and so on...
Can you help me with a brief explanation about why in the TkConsole I can upload the file and use it as a variable while I can not in the tcl script?
Also, if you have where is my mistake, I will appreciate it.
I apologize if it is basic, but I could not find any answer. Thanks.
I add the head of my script:
set grofile "sim.part0001_protein_lipid.gro"
set xtcfile "protein_lipid.xtc"
set intime "0-5ms"
set system "lower"
source view_change_render.tcl
source cg_bonds.tcl
mol load gro $grofile xtc ${system}_${intime}_${xtcfile}
It was solved, thanks for your help.
You may think you've typed the same thing, but you haven't. I'm guessing that your real filename has spaces in it, and that you've not put double-quotes around it. That will confuse set as Tcl's general parser will end up giving set more arguments than it expects. (Tcl's general parser does not know that set only takes one or two arguments, by very long standing policy of the language.)
So you should really do:
set grofile "file.gro"
Don't leave the double quotes out if you have a complicated name.
Also, this won't work:
set grofile [./file.gro]
because […] is used to indicate running something as a command and using the result of that. While ./file.gro is actually a legal command name in Tcl, it's… highly unlikely.
And this won't work:
set grofile [file ./file.gro r]
Because the file command requires a subcommand as a first argument. The word you give is not one of the standard file subcommands, and none of them accept those arguments anyway, which look suitable for open (though that returns a channel handle suitable for use with commands like gets and read).
The TkConsole is actually pretty reasonable as quick-and-dirty terminal emulations go (given that it omits a lot of the complicated cases). The real problem is that you're not being consistently accurate about what you're really typing; that matters hugely in most programming languages, not just Tcl. You need to learn to be really exacting; cut-n-paste when creating a question helps a lot.
Related
proc pub:write { nick host handle channel arg } {
set fid [open /var/www/test.txt w]
puts $fid "█████████████████████████████████████████████████████████████████"
puts $fid "██"
close $fid
}
when i open i Webbrowser its Result so :
█████████████████████████████████████████████████████████████████
but it should :
█████████████████████████████████████████████████████████████████
Welcome to the yawning pit of complexity that is string encodings. You've got to get two things right to make what you're trying to do work. READ EVERYTHING BELOW BEFORE MAKING CHANGES as it all interacts horribly.
The character needs to be written to the file using the right encoding. This is done by configuring the encoding on the channel, which defaults to a system-specific value that is usually but not always right.
I'm taking a very wild guess that an encoding like “cp437 DOSLatinUS” is the right one.
fconfigure $fid -encoding cp437
However, Tcl's usually pretty good at picking the right thing to do by default.
Also, there's a huge number of different encodings. Some are very similar to each other and picking which one to use is a bit of a black art. The usual best bet is to stick with utf8 when possible, and otherwise to use the correct encoding (defined by protocol or by the system) and take a vast amount of care. This is really complicated!
You've also got to get the character into Tcl correctly in the first place. This means that the character has to be encoded in the source file, and Tcl has to read that file with the right encoding. Since the file is being written by another program (your editor usually) there's all sorts of potential for trouble. If you can discover what encoding is being used there (usually a matter of complete guesswork) then you can use the -encoding option to tclsh or source to allow Tcl to figure out what is going on.
Alternatively, stick with the ASCII subset in your source as that's pretty reliably handled the same whatever encoding is in use. You do this by converting each █ to the Tcl escape sequence \u2588. At least like that, you can be sure that you're only hunting down problems with the output encoding.
When debugging this thing, only change one thing at a time before retesting as there's a lot of bits that can go wrong and poison what is going on in ways that produce weird results downstream. I advise trying the escape sequence first as that at least means that you know that the input data is correct; once you know that you're not pushing garbage in, you can try hunting down whether you're actually getting problems with getting garbage out and what to do about it.
Finally, be aware that mixing in networking in this makes the problems about ten times harder…
I have the latest TCL build from Active State installed on a desktop and laptop both running Windows 10. I'm new to TCL and a novice developer and my reason for learning TCL is to enhance my value on the F5 platform. I figured a good first step would be to stop the occasional work I do in VBScript and port that to TCL. Learning the language itself is coming along alright, but I'm worried my project isn't viable due to performance. My VBScripts absolutely destroy my TCL scripts in performance. I didn't expect that outcome as my understanding was TCL was so "fast" and that's why it was chosen by F5 for iRules etc.
So the question is, am I doing something wrong? Is the port for Windows just not quite there? Perhaps I misunderstood the way in which TCL is fast and it's not fast for file parsing applications?
My test application is a firewall log parser. Take a log with 6 million hits and find the unique src/dst/port/policy entries and count them; split up into accept and deny. Opening the file and reading the lines is fine, TCL processes 18k lines/second while VBScript does 11k. As soon as I do anything with the data, the tide turns. I need to break the four pieces of data noted above from the line read and put in array. I've "split" the line, done a for-next to read and match each part of the line, that's the slowest. I've done a regexp with subvariables that extracts all four elements in a single line, and that's much faster, but it's twice as slow as doing four regexps with a single variable and then cleaning the excess data from the match away with trims. But even this method is four times slower than VBScript with ad-hoc splits/for-next matching and trims. On my desktop, i get 7k lines/second with TCL and 25k with VBscript.
Then there's the array, I assume because my 3-dimensional array isn't a real array that searching through 3x as many lines is slowing it down. I may try to break up the array so it's looking through a third of the data currently. But the truth is, by the time the script gets to the point where there's a couple hundred entries in the array, it's dropped from processing 7k lines/second to less than 2k. My VBscript drops from about 25k lines to 22k lines. And so I don't see much hope.
I guess what I'm looking for in an answer, for those with TCL experience and general programming experience, is TCL natively slower than VB and other scripts for what I'm doing? Is it the port for Windows that's slowing it down? What kind of applications is TCL "fast" at or good at? If I need to try a different kind of project than reading and manipulating data from files I'm open to that.
edited to add code examples as requested:
while { [gets $infile line] >= 0 } {
some other commands I'm cutting out for the sake of space, they don't contribute to slowness
regexp {srcip=(.*)srcport.*dstip=(.*)dstport=(.*)dstint.*policyid=(.*)dstcount} $line -> srcip dstip dstport policyid
the above was unexpectedly slow. the fasted way to extract data I've found so far
regexp {srcip=(.*)srcport} $line srcip
set srcip [string trim $srcip "cdiloprsty="]
regexp {dstip=(.*)dstport} $line dstip
set dstip [string trim $dstip "cdiloprsty="]
regexp {dstport=(.*)dstint} $line dstport
set dstport [string trim $dstport "cdiloprsty="]
regexp {policyid=(.*)dstcount} $line a policyid
set policyid [string trim $policyid "cdiloprsty="]
Here is the array search that really bogs down after a while:
set start [array startsearch uList]
while {[array anymore uList $start]} {
incr f
#"key" returns the NAME of the association and uList(key) the VALUE associated with name
set key [array nextelement uList $start]
if {$uCheck == $uList($key)} {
##puts "$key CONDITOIN MET"
set flag true
adduList $uCheck $key $flag2
set flag2 false
break
}
}
Your question is still a bit broad in scope.
F5 has published some comment why they choose Tcl and how it is fast for their specific usecases. This is actually a bit different to a log parsing usecase, as they do all the heavy lifting in C-code (via custom commands) and use Tcl mostly as a fast dispatcher and for a bit of flow control. And Tcl is really good at that compared to various other languages.
For things like log parsing, Tcl is often beaten in performance by languages like Python and Perl in simple benchmarks. There are a variety of reasons for that, here are some of them:
Tcl uses a different regexp style (DFA), which are more robust for nasty patterns, but slower for simple patterns.
Tcl has a more abstract I/O layer than for example Python, and usually converts the input to unicode, which has some overhead if you do not disable it (via fconfigure)
Tcl has proper multithreading, instead of a global lock which costs around 10-20% performance for single threaded usecases.
So how to get your code fast(er)?
Try a more specific regular expression, those greedy .* patterns are bad for performance.
Try to use string commands instead of regexp, some string first commands followed by string range could be faster than a regexp for these simple patterns.
Use a different structure for that array, you probably want either a dict or some form of nested list.
Put your code inside a proc, do not put it all in a toplevel script and use local variables instead of globals to make the bytecode faster.
If you want, use one thread for reading lines from file and multiple threads for extracting data, like a typical producer-consumer pattern.
So the thing that makes this whole question hard is that I am working in a bash shell environment. I am parsing a large amount of data that is all located in text files in a set of directories. The environment I am working in does not have a gui, and is just the shell, and I am executing the commands from the shell through mysql, I am not logged into mysql.
I am the partner on a project, the main part is a bash script that searches for information and inserts it into text files in several directories. My operations parse out the needed data and inserts it into the database.
I run my main loop through a shell script. It loops through a set of directories and searches for the .txt files in each. I then pass the information to my procedure. In something like the below.
NOTE: I am not an expert in bash and have just started learning.
mysql - user -p'mypassword' --database=dbname <<EFO
call Procedure_Name("`cat ${textfile}`");
EOF
Since I am working in mysql and bash only I can not use another language to make my life easier so I use SUBSTRING_INDEX mostly. So an illustration of the procedure is shown below.
DELIMITER $$
CREATE PROCEDURE Procedure_name(textfile LONGTEXT)
BEGIN
DECLARE data LONGTEXT;
SET data = SUBSTRING_INDEX(SUBSTRING_INDEX(textfile,"(+++)",1),"(++)",-1));
INSERT INTO Table_Name (column) values (data);
END; $$
DELIMITER ;
The text file is a clean structure that allows for me to cut it up, but the problem I am having is that special characters inside of the textfile is causing my procedure to throw an error. I believe they are escape characters and I need a way around this. Just about any character could appear in the data I am parsing so I need a way to ignore these characters in the procedure or to cause them to not affect my process.
I tried looking into mysql_real_escape_string() however the parameters were hard to figure out and it looks like it only works in PHP but I am not sure. So I would like to do something at the beginning of my procedure to maybe insert "\"'s or something into the string to not cause my procedure to fail.
Also, these textfiles range from 16k to 11000k so I need something that can handle that. My process works sometimes but is getting caught up on a lot of stuff and my searching has not helped me at all. So any help would be greatly appreciated!!!
and thanks to all to reading this long description. normally I can find my answer or piece it together from questions but I had no luck this time so I figured it was about time to make an account and ask something.
Your question is really too board, but here is an example of what I mean
a script file:
#!/bin/bash
case $# in
1 ) inFile=$1 ;;
* ) echo "usage: myLoader infile"; exit 1 ;;
esac
awk 'BEGIN {
FS="\t"'; OFS="|"
}
{
sub(/badChars/, "", $0); sub(/otherBads/, "", $0) ; # .... as many as needed
# but be careful, easy to delete stuff that with too broad a brush.
print $1, $2, $5, $4, $9
}' $inFile > $inFile.psv
bcp -in -f ${formatFile:-formatFile} $inFile.psv
Note how awk makes it very easy, by repeating sub(...) commands to remove any "bad chars" you may have in your source data AND to reorganize the order of the columns in your data. Each $n is the value in numbered column on a line, so $1, $2, $5 skips fields $3 and $4, for example.
The OFS is set to the pipe char, making it easy to see in your output where exactly the field boundaries are AND if there are any leading or trailing whitespace characters that may be throwing off your load.
The > $inFile.psv keeps your original file, just in case you make a mistake in the awk script.
If you create really small test data files, you can eliminate saving to a file and just let the output go to the screen, editing until you get it right.
You'll have to find out exactly how mySQL's equivalent of bcp works. I'm pretty sure I've seen postings here. Either that, or post a separate question, "I have this pipe-delimited file with 8 columns, how do I load it to my table?".
The reference in my sample code to ${formatFile} is that hopefully the mySQL bcp command can take a format file that specifies the order and types of fields to be loaded into a file. Good bcp fmt files allow a fair amount of flexibility, but you'll have to read the man page for that utility AND do some research to understand the scope and restraints on that flexibility.
Going forward, you should post individual questions like, "I've tried x using lang Y to filter Z characters. Right now I'm getting output z, What am I doing wrong?"
Divide and conquer. There is no easy way. Reset those customer and boss expectations, you're learning something new, and it will take a little study to get it right. Good luck.
IHTH
>> set signal_name [get_fanout abc_signal]
{xyz_blah_blah}
>> echo $signal_name
#142
>> set signal_name [get_fanout abc_signal]
{xyz_blah_blah}
>> echo $signal_name
#144
>>
I tried other stuff like catch etc, and every where, it returns #number. My goal is to be able to print the actual value instead of the number - xyz_blah_blah.
I am new to tcl. Want to understand, if this is an array or a pointer to an array or something like that. When I try the exact same thing with a different command, which returns just a value, then it works. This is a new command which returns value in parenthesis.
Please help. Thanks.
Every Tcl command produces a result value, which you capture and use by putting the call of the command in [square brackets] and putting the whole lot as part of an argument to another command. Thus, in:
set signal_name [get_fanout abc_signal]
the result of the call to get_fanout is used as the second argument to set. I suggest that you might also like to try doing this:
puts "-->[get_fanout abc_signal]<--"
It's just the same, except this time we're concatenating it with some other small string bits and printing the whole lot out. (In case you're wondering, the result of puts itself is always the empty string if there isn't an error, and set returns the contents of the variable.)
If that is still printing the wrong value (as well as the right one beforehand, without arrow marks around it) the real issue may well be that get_fanout is not doing what you expect. While it is possible to capture the standard output of a command, doing so is a considerably more advanced technique; it is probably better to consider whether there is an alternate mechanism to achieve what you want. (The get_fanout command is not a standard part of the Tcl language library or any very common add-on library like Tk or the Tcllib collection, so we can only guess at its behavior.)
Say I have a tcl script and I want to pass some arguments to the second script file which is being sourced in the first tcl:
#first tcl file
source second.tcl
I want to control the flow of second.tcl from first.tcl and I read that tcl source does not accept arguments. I wonder how I can do then.
source does not accept any additional arguments. But you can use (global) variables to pass arguments, e.g.:
# first tcl file
set ::some_variable some_value
source second.tcl
The second TCL file can reference the variable, e.g.:
# second tcl file
puts $::some_variable
Remark:
Sourcing a file means that the content of the sourced script is executed in the current context. That means that the sourced script has access to all variables existing in that context. The above code is the same as:
# one joint tcl file
set ::some_variable some_value
puts $::some_variable
Regarding the "::" thing -- see the explanation here (sorry, I don't have enough rep. to leave comments yet).
I should also add that the original question discusses a problem which appears to be quite odd: it seems that it could be better to provide a specific procedure in your second source file that would set up a state pertaining to what is defined by that script.
Something like:
source file2.tcl
setup_state $foo $bar $baz
Making [source] behave differently based on some global variables looks too obscure to me. Of course you might have legitimate reasons to do this, but anyway...