I need code to remove only the last newline character from a file in Tcl.
Suppose the file contains:
aaa 11
bbb 12
cc 14
newline character
How do I remove that final newline character from the file in Tcl?
Please help me with this!
Seeking and truncating are your friends. (Requires Tcl 8.5 or later.)
set f [open "theFile.txt" r+]
# Skip to where last newline should be; use -2 on Windows (because of CRLF)
chan seek $f -1 end
# Save the offset for later
set offset [chan tell $f]
# Only truncate if we're really sure we've got a final newline
if {[chan read $f] eq "\n"} {
    # Do the truncation!
    chan truncate $f $offset
}
close $f
For removing data from anywhere other than the end of the file, it's easiest to rewrite the file (either by loading the data all into memory or by streaming and transforming to a new file that you move back over, the latter being harder but necessary with large files). Truncation can only work at the end.
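The streaming approach described above could be sketched roughly as follows. The procedure name `rewriteFile`, the `.tmp` suffix, and the use of a glob pattern to choose the lines to drop are all assumptions made for illustration; they are not part of the original answer.

```tcl
# Hypothetical helper: copy every line of $path that does NOT match the glob
# pattern $dropPattern into a temporary file, then move that file back over
# the original.
proc rewriteFile {path dropPattern} {
    set tmp "$path.tmp"   ;# assumed temp name; must not already be in use
    set in  [open $path r]
    set out [open $tmp w]
    while {[gets $in line] >= 0} {
        if {![string match $dropPattern $line]} {
            puts $out $line
        }
    }
    close $in
    close $out
    # Replace the original with the filtered copy
    file rename -force -- $tmp $path
}
```

Only one line is held in memory at a time, which is why this variant suits large files. Note that the rewritten file always ends with a newline, because `puts` adds one after every line.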
Related
I am looking to write a Tcl script that reads each line of a file, say abc.txt. Each line of abc.txt is the location of a file that needs to be picked up, except for the lines that are commented out.
For example, abc.txt has:
./pvr.vhd
./pvr1.vhd
// ./pvr2.vhd
So I need to read each line of abc.txt, pick up the file from the location it mentions, and store it in a separate file, except for the ones that start with "//".
Any hint or script will be deeply appreciated.
The usual way of doing this is to put a filter at the start of the line-processing loop so that commented lines are skipped. You can use string match to detect whether a line should be filtered out.
set f [open "abc.txt"]
set lines [split [read $f] "\n"]
close $f
foreach line $lines {
    if {[string match "//*" $line]} {
        continue
    }
    # ... do your existing processing here ...
}
This also works just as well when used with a streaming loop (while {[gets $f line] >= 0} {…}).
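A streaming version of the same filter might look like the sketch below. The procedure name `readFileList` is hypothetical, and trimming whitespace and skipping blank lines are extra assumptions that the question did not ask for.

```tcl
# Hypothetical helper: return the uncommented entries of a list file.
proc readFileList {listFile} {
    set result {}
    set f [open $listFile]
    while {[gets $f line] >= 0} {
        # Assumption: surrounding whitespace is not significant
        set line [string trim $line]
        # Skip blank lines and lines commented out with //
        if {$line eq "" || [string match "//*" $line]} {
            continue
        }
        lappend result $line
    }
    close $f
    return $result
}
```

The returned list can then be fed to whatever per-file processing is needed, for example `foreach path [readFileList abc.txt] { ... }`.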
Let's say that I have opened a file using:
set in [open "test.txt" r]
I intend to revise a string on a certain line, like:
style="fill:#ff00ff;fill-opacity:1"
and the line number is 20469.
I want to change the value ff00ff to another string value, such as ff0000.
What is the proper way to do this? Thanks in advance!
You need to open the file in read-write mode; the r+ mode is probably suitable.
In most cases with files up to a reasonable number of megabytes long, you can read the whole file into a string, process that with a command like regsub to perform the change in memory, and then write the whole thing back after seeking to the start of the file. Since you're not changing the size of the file, this will work well. (Shortening the file requires explicit truncation.)
set f [open "test.txt" r+]
set data [read $f]
# The closing quote sits inside the second group so that \2 puts it back
regsub {(style="fill:#)ff00ff(;fill-opacity:1")} $data {\1ff0000\2} data
seek $f 0
puts -nonewline $f $data
# If you need it, add this here by uncommenting:
#chan truncate $f
close $f
There are other ways to do the replacement; the choice depends on the details of what you're doing.
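One such alternative, since the question names a specific line number, is to change only that line. The sketch below is an illustration under assumptions: the name `replaceOnLine` is made up, the replacement is a literal one done with `string map`, and `chan truncate` requires Tcl 8.5 or later.

```tcl
# Hypothetical helper: replace the literal text $from with $to,
# but only on the 1-based line $lineNo of the file $path.
proc replaceOnLine {path lineNo from to} {
    set f [open $path r+]
    set lines [split [read $f] "\n"]
    set idx [expr {$lineNo - 1}]
    if {$idx < 0 || $idx >= [llength $lines]} {
        close $f
        error "line $lineNo is out of range"
    }
    lset lines $idx [string map [list $from $to] [lindex $lines $idx]]
    # Rewrite from the start, then cut off any leftover tail
    seek $f 0
    puts -nonewline $f [join $lines "\n"]
    chan truncate $f
    close $f
}
```

For the file in the question this would be called as `replaceOnLine test.txt 20469 ff00ff ff0000`. Because the truncation is done explicitly, it also works when the replacement is shorter than the original text.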
I'm trying to implement a Tcl script which reads a text file, masks all the sensitive information (such as passwords, IP addresses etc.) contained in it, and writes the output to another file.
As of now I'm just substituting this data with ** or ##### and searching the entire file with regexp to find the stuff which I need to mask. But since my text file can be 100K lines of text or more, this is turning out to be incredibly inefficient.
Are there any built in tcl functions/commands I can make use of to do this faster? Do any of the add on packages provide extra options which can help get this done?
Note: I'm using tcl 8.4 (But if there are ways to do this in newer versions of tcl, please do point me to them)
Generally speaking, you should put your code in a procedure to get best performance out of Tcl. (You have got a few more related options in 8.5 and 8.6, such as lambda terms and class methods, but they're closely related to procedures.) You should also be careful with a number of other things:
Put your expressions in braces (expr {$a + $b} instead of expr $a + $b) as that enables a much more efficient compilation strategy.
Pick your channel encodings carefully. (If you do fconfigure $chan -translation binary, that channel will transfer bytes and not characters. However, gets is not very efficient on byte-oriented channels in 8.4. Using -encoding iso8859-1 -translation lf will give most of the benefits there.)
Tcl does channel buffering quite well.
It might be worth benchmarking your code with different versions of Tcl to see which works best. Try using a tclkit build for testing if you don't want to go to the (minor) hassle of having multiple Tcl interpreters installed just for testing.
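As a rough way to try the advice about procedures, braced expressions, and benchmarking, the built-in `time` command can be used. The two procedures below are made-up examples that differ only in whether the `expr` argument is braced; the actual timings depend on the Tcl version and machine, so none are shown here.

```tcl
# Same loop twice: once with a braced expression, once without.
proc sumBraced {n} {
    set total 0
    for {set i 0} {$i < $n} {incr i} {
        set total [expr {$total + $i}]
    }
    return $total
}

proc sumUnbraced {n} {
    set total 0
    for {set i 0} {$i < $n} {incr i} {
        set total [expr $total + $i]
    }
    return $total
}

# time runs the script the given number of times and reports the average
puts "braced:   [time {sumBraced 10000} 100]"
puts "unbraced: [time {sumUnbraced 10000} 100]"
```

Running the same script under each interpreter you are considering gives a like-for-like comparison.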
The idiomatic way to do line-oriented transformations would be:
proc transformFile {sourceFile targetFile RE replacement} {
    # Open for reading
    set fin [open $sourceFile]
    fconfigure $fin -encoding iso8859-1 -translation lf
    # Open for writing
    set fout [open $targetFile w]
    fconfigure $fout -encoding iso8859-1 -translation lf
    # Iterate over the lines, applying the replacement
    while {[gets $fin line] >= 0} {
        regsub -- $RE $line $replacement line
        puts $fout $line
    }
    # All done
    close $fin
    close $fout
}
If the file is small enough that it can all fit in memory easily, this is more efficient because the entire match-replace loop is hoisted into the C level:
proc transformFile {sourceFile targetFile RE replacement} {
    # Open for reading
    set fin [open $sourceFile]
    fconfigure $fin -encoding iso8859-1 -translation lf
    # Open for writing
    set fout [open $targetFile w]
    fconfigure $fout -encoding iso8859-1 -translation lf
    # Apply the replacement over all lines
    regsub -all -line -- $RE [read $fin] $replacement outputlines
    # read keeps the file's final newline, so don't let puts add another
    puts -nonewline $fout $outputlines
    # All done
    close $fin
    close $fout
}
Finally, regular expressions aren't necessarily the fastest way to do matching of strings (for example, string match is much faster, but accepts a far more restricted type of pattern). Transforming one style of replacement code to another and getting it to go really fast is not 100% trivial (REs are really flexible).
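To illustrate the point about cheaper alternatives, here is a small sketch. It assumes that some of the sensitive values are known literal strings (so `string map` can replace them without any regular expression), and that a glob test with `string match` can cheaply rule out lines before `regsub` runs. The sample secrets, the `password=` format, and both procedure names are invented for this example.

```tcl
# Literal secrets and their masks; these values are made up for illustration
set maskMap [list "hunter2" "*****" "10.0.0.1" "#####"]

# Replace every literal key of $mapping found in $text in one pass
proc maskLiterals {text mapping} {
    return [string map $mapping $text]
}

# Only pay for the regexp when a cheap glob test says it can match
proc maskPassword {line} {
    if {[string match "*password=*" $line]} {
        regsub -all -- {password=\S+} $line {password=*****} line
    }
    return $line
}

puts [maskLiterals "pw hunter2 from 10.0.0.1" $maskMap]   ;# pw ***** from #####
puts [maskPassword "user=a password=secret x"]            ;# user=a password=***** x
```

Whether this is actually faster for a given file is something to measure rather than assume.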
Especially for very large files, as mentioned, reading the whole file into a variable is not the best approach. Once your system runs out of memory, you can't prevent your app from crashing. For data that is separated by line breaks, the easiest solution is to buffer one line at a time and process it.
Just to give you an example:
# Open old and new file
set old [open "input.txt" r]
set new [open "output.txt" w]
# Configure input channel to provide data separated by line breaks
fconfigure $old -buffering line
# Until the end of the file is reached:
while {[gets $old ln] != -1} {
    # Mask sensitive information on variable ln
    ...
    # Write back line to new file
    puts $new $ln
}
# Close channels
close $old
close $new
I can't think of any better way to process large files in Tcl; please feel free to tell me about a better solution. But Tcl was not made to process large data files. For real performance you may want to use a compiled language instead of a scripting language.
Edit: Replaced ![eof $old] in while loop.
A file with 100K lines is not that much (unless every line is 1K chars long :) so I'd suggest you read the entire file into a var and make the substitution on that var:
set fd [open file r+]
set buf [read $fd]
set buf [regsub -all $(the-passwd-pattern) $buf ****]
# write it back
seek $fd 0; # This is not safe! See potrzebie's comment for details.
puts -nonewline $fd $buf
close $fd
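One way to sidestep the problem flagged in the comment above is sketched below. The assumption is that the danger is a stale tail: when the masked text is shorter than the original, writing it from offset 0 leaves old bytes at the end of the file. Reopening the file in `w` mode truncates it first and also works on Tcl 8.4. The procedure name `maskFile` and its parameters are hypothetical.

```tcl
# Hypothetical helper: mask every match of $pattern in the file $path.
proc maskFile {path pattern mask} {
    # Read everything, then close before rewriting
    set fd [open $path r]
    set buf [read $fd]
    close $fd
    regsub -all -- $pattern $buf $mask buf
    # Opening with "w" truncates the file, so no stale tail can survive
    set fd [open $path w]
    puts -nonewline $fd $buf
    close $fd
}
```

This is not atomic: if the script dies between the two opens, the file may be left empty, so writing to a separate output file is the safer choice when that matters.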
I am using a Tcl script which takes a movie file trace and converts it into a binary file, which is then used by the application agent in ns-2. Here is the part of the script that converts the movie file trace into a binary file:
set original_file_name Verbose_Silence_of_the_Lambs_VBR_H263.dat
set trace_file_name video.dat
set original_file_id [open $original_file_name r]
set trace_file_id [open $trace_file_name w]
set last_time 0
while {[eof $original_file_id] == 0} {
    gets $original_file_id current_line
    # Skip blank lines and comment lines starting with #
    if {[string length $current_line] == 0 ||
        [string compare [string index $current_line 0] "#"] == 0} {
        continue
    }
    scan $current_line "%d%s%d" next_time type length
    # Braced expression: compiled once and safe from double substitution
    set time [expr {1000*($next_time-$last_time)}]
    set last_time $next_time
    puts -nonewline $trace_file_id [binary format "II" $time $length]
}
close $original_file_id
close $trace_file_id
But when I used this created video.dat file further for traffic generation used by application agent I got the following error:
Bad file siz in video.dat
Segmentation fault
Kindly have a look at this. What is the meaning of binary format "II" in the code? I have not found it mentioned in the tcl-binary(n) documentation. Is it outdated and no longer supported?
The problem is probably that you don't open your file in binary mode.
Change
set trace_file_id [open $trace_file_name w]
to
set trace_file_id [open $trace_file_name wb]
Otherwise Tcl will change the output, e.g. it replaces \n with \r\n on Windows.
(Also, each byte value above 127 will be treated as a Unicode code point and converted to your system encoding, which corrupts the binary data.)
While such translations are fine for text files, they cause problems with binary files.
Fortunately only a single character is needed to fix that: b as modifier for open
Edit: I just looked it up in the Tcl change list: the b modifier for open was added in 8.5. I usually only use 8.5 or 8.6, so if you are using an older version of Tcl, add the following line after the open:
fconfigure $trace_file_id -translation binary
The b modifier is just a shortcut for that.
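On the other part of the question, what "II" means: in the format string of `binary format`, each `I` packs one 32-bit integer in big-endian byte order, so "II" produces eight bytes from two integers. The short round trip below illustrates this; the values 40000 and 1234 are arbitrary examples, not taken from the trace file.

```tcl
# Pack two integers the same way the trace script does
set packed [binary format "II" 40000 1234]
puts [string length $packed]    ;# prints 8 (two 4-byte fields)

# Show the raw bytes as hex: most significant byte first
binary scan $packed H* hex
puts $hex                       ;# prints 00009c40000004d2

# Unpack again to confirm the round trip
binary scan $packed "II" t len
puts "$t $len"                  ;# prints 40000 1234
```

If the consumer of the file expects exactly these eight bytes per record, any newline or encoding translation on the output channel will shift or alter them, which is consistent with the size error reported above.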
Hello, I was wondering if it's possible to read the last line of a real-time logfile with eggdrop and a .tcl script. I'm able to read the first part of the logfile, but that's it; it doesn't read any more of it.
Is it possible to put an upper bound on the length of a line of the logfile? If so, it's pretty easy to get the last line:
# A nice fat upper bound!
set upperBoundLength 1024
# Open the log file
set f [open $logfile r]
# Go to some distance from the end; catch because don't care about errors here
catch {seek $f -$upperBoundLength end}
# Read to end, stripping trailing newline
set data [read -nonewline $f]
# Hygiene: close the logfile
close $f
# Get the last line
set lastline [lindex [split $data "\n"] end]
Note that it's not really necessary to do the seek; it just saves you from having to read the vast majority of the file which you presumably don't want.
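The steps above can be wrapped in a procedure so that a script can call it each time it wants the newest line. This is only a sketch: the name `lastLine` and the default bound of 1024 are assumptions, and how it gets called periodically from eggdrop is left out.

```tcl
# Hypothetical wrapper around the steps above.
proc lastLine {logfile {upperBoundLength 1024}} {
    set f [open $logfile r]
    # If the file is shorter than the bound the seek fails; ignore that
    # and simply read from the start instead
    catch {seek $f -$upperBoundLength end}
    set data [read -nonewline $f]
    close $f
    return [lindex [split $data "\n"] end]
}
```

Because the file is reopened on every call, lines appended to the log since the previous call are picked up.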