Removing elements from results of glob function in TCL - tcl

I am running:
glob -nocomplain *
As a result I get 4 files:
a b c d
How can I remove b from the list?
I am using this proc:
proc lremove {args} {
    if {[llength $args] < 2} {
        puts stderr {Wrong # args: should be "lremove ?-all? list pattern"}
    }
    set list [lindex $args end-1]
    set elements [lindex $args end]
    if [string match -all [lindex $args 0]] {
        foreach element $elements {
            set list [lsearch -all -inline -not -exact $list $element]
        }
    } else {
        # Using lreplace to truncate the list saves having to calculate
        # ranges or offsets from the indexed element. The trimming is
        # necessary in cases where the first or last element is the
        # indexed element.
        foreach element $elements {
            set idx [lsearch $list $element]
            set list [string trim \
                "[lreplace $list $idx end] [lreplace $list 0 $idx]"]
        }
    }
    return $list
}
However, it does not work with glob results, only with strings. Please help.

That lremove procedure is rather dodgy, really, what with swapping the order around, building the result by string concatenation, and using string trim to try to clean up the mess. Yuck. Here's a simpler version (without support for -all, which you don't need for processing the output of glob as that's normally a list of unique elements anyway):
proc lremove {list args} {
    foreach toRemove $args {
        set index [lsearch -exact $list $toRemove]
        set list [lreplace $list $index $index]
    }
    return $list
}
Let's test it!
% lremove {a b c d e} b d f
a c e
Theoretically it could be made more efficient, but it would take a lot of work and be a PITA to debug. This version is way easier to write and is obviously correct. It should also be substantially faster than what you were working with, as it sticks to purely list operations.
The results from glob shouldn't be special enough to need any unusual effort to work with, but there were some really nasty historic bugs that made that not always true. The latest versions of 8.4 and 8.5 (i.e., 8.4.20 and 8.5.15) don't have the bugs. Nor does any release version of 8.6 (8.6.0 or 8.6.1). If stuff is behaving mysteriously, we'll get into asking about versions and telling you to not be quite so behind the times…
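As a concrete sketch of the question's case, the snippet below removes b from a file list using the lremove shown above. The literal list is a stand-in for the result of glob -nocomplain * (an assumption, so that the example does not depend on the current directory). The last line is an alternative that needs no helper proc at all.

```tcl
proc lremove {list args} {
    foreach toRemove $args {
        set index [lsearch -exact $list $toRemove]
        set list [lreplace $list $index $index]
    }
    return $list
}

# Stand-in for: set files [glob -nocomplain *]
set files {a b c d}

# Remove the single element "b" with the helper.
puts [lremove $files b]

# Alternative without a helper: keep every element that is not exactly "b".
puts [lsearch -all -inline -not -exact $files b]
```

Both puts lines should print a c d. The lsearch form removes every occurrence, which makes no difference for glob output since file names in it are normally unique.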

Related

How to copy exactly in tcl?

I created a list using:
set list1 { o\\/one o\\/two o\\/three }
Now I want to copy this list to another list, adding { } around each item.
My new list should become:
{ {o\\/one} {o\\/two} {o\\/three} }
I tried using
foreach a $list1 {
    set x "{$a}"
    append new_list " " "{$a}"
    lappend new_list1 $x
}
new_list → {o\/one} {o\/two} {o\/three}
new_list1 → {{o\/one}} {{o\/two}} {{o\/three}}
Please help?
Your original list has these items in it (as you can verify with lindex):
puts [lindex $list1 0] → o\/one
puts [lindex $list1 1] → o\/two
puts [lindex $list1 2] → o\/three
Any list that has those elements in it, however encoded, is pairwise-equivalent. The canonical form (as produced by Tcl's own list operations) of the list is:
{o\/one} {o\/two} {o\/three}
Perhaps the easiest way of obtaining that is:
set list2 [lrange $list1 0 end]
The lrange command uses Tcl's standard list-to-string engine (shared with a great many other commands). That prefers to not add braces, but prefers adding braces to adding backslashes; backslashes are a last resort because they're ugly and hard to read. But it works with arbitrary contents in the elements; just blindly adding braces is vulnerable to tricky edge cases.
Another way of getting the above canonical form is this (provided you're not stuck on versions of Tcl so old they're no longer supported):
set list2 [list {*}$list1]
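A small sketch to check this for yourself, assuming Tcl 8.5 or later for {*}. It rebuilds the list in canonical form and compares the two lists element by element, so you can see that only the string encoding changes, not the contents.

```tcl
set list1 { o\\/one o\\/two o\\/three }

# Rebuild the list; the result uses Tcl's canonical quoting.
set list2 [list {*}$list1]

# Both lists have the same length...
puts [expr {[llength $list1] == [llength $list2]}]

# ...and each pair of elements is character-for-character equal.
foreach a $list1 b $list2 {
    puts [list $a [string equal $a $b]]
}

# The canonical string form of the whole list.
puts $list2
```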
[EDIT]: If you've got a string with some things in it separated by spaces, you might want to convert it into a proper list; this is useful particularly when the input data contains list metacharacters like braces and (relevant in this case) backslashes. There are two main ways to do this:
set theList [split $inputString]
set theList [regexp -all -inline {\S+} $inputString]
They differ in what happens when the input string has two (or more) spaces between two words:
set inputString "a b  c d"; # NB: two spaces between b and c
puts [split $inputString]; # ==> a b {} c d
puts [regexp -all -inline {\S+} $inputString]; # ==> a b c d
There are use-cases for both.

Converting Columns in a List in Tcl Script

I want to convert a column of a file into a list using a Tcl script. I have a file named "input.dat" with the data in two columns as follows:
7 0
9 9
0 2
2 1
3 4
I want to convert the first column into a list, so I wrote the following Tcl script:
set input [open "input.dat" r]
set data [read $input]
set values [list]
foreach line [split $data \n] {
    lappend values [lindex [split $line " "] 0]
}
puts "$values"
close $input
The result is: 7 9 0 2 3 {} {}
What are these two extra "{}", what error in my script produces them, and how can I fix it?
Can anybody help me?
Those empty braces indicate empty strings. The file you used most probably had a couple empty lines at the end.
You could avoid this situation by checking a line before lappending the first column to the list of values:
foreach line [split $data \n] {
    # if the line is not equal to blank, then lappend it
    if {$line ne ""} {
        lappend values [lindex [split $line " "] 0]
    }
}
You can also remove those empty strings after getting the result list, but it would mean you'll be having two loops. Still can be useful if you cannot help it.
For example, using lsearch to get all the values that are not blank (probably simplest in this situation):
set values [lsearch -all -inline -not $values ""]
Or lmap to achieve the same (a bit more complex IMO but gives more flexibility when you have more complex situations):
set values [lmap n $values {if {$n eq ""} continue; set n}]
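To see where the empty elements come from, here is a small sketch that uses an in-memory string in place of input.dat. The sample data, including its two trailing newlines, is an assumption made for illustration.

```tcl
# Stand-in for [read $input]: five data lines followed by one blank line.
set data "7 0\n9 9\n0 2\n2 1\n3 4\n\n"

set lines [split $data \n]
# split returns an empty string for the blank line and another one
# for the text after the final newline.
puts [llength $lines]

set values {}
foreach line $lines {
    # Skip empty lines before taking the first column.
    if {$line eq ""} continue
    lappend values [lindex [split $line " "] 0]
}
puts $values
```

With this data the first puts prints 7 (five data lines plus two empty strings) and the second prints 7 9 0 2 3.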
The first {} is caused by the blank line after 3 4.
The second {} comes from the empty string that split returns after the file's final newline.
If the last blank line is removed from the file, then there will be only one {}.
If the loop is then coded in the following way, then there will be no {}.
foreach line [split $data \n] {
    if { $line eq "" } { break }
    lappend values [lindex [split $line " "] 0]
}
@jerry has a better solution
Unless intermittent empty strings carry some meaning important to your program's task, you may also use a transformation from a Tcl list (with empty-string elements) to a string that prunes empty-string elements (at the ends, and in-between):
concat {*}[split $data "\n"]

How I can get unmatched part of string using TCL?

I am comparing two strings. How can I get the parts of the strings that did not match?
This is an interesting problem that requires a longest common subsequence algorithm. Tcl's got one of those already in Tcllib, but it's for lists. Fortunately, we can convert a string into a list of characters with split:
package require struct::list
set a "the quick brown fox"
set b "the slow green fox"
set listA [split $a ""]; set lenA [llength $listA]
set listB [split $b ""]; set lenB [llength $listB]
set correspondences [struct::list longestCommonSubsequence $listA $listB]
set differences [struct::list lcsInvertMerge $correspondences $lenA $lenB]
Now we can get the parts that didn't match up by picking the parts from the differences that are added, changed or deleted:
set common {}
set unmatchedA {}
set unmatchedB {}
foreach diff $differences {
    lassign $diff type rangeA rangeB
    switch $type {
        unchanged {
            lappend common [join [lrange $listA {*}$rangeA] ""]
        }
        added {
            lappend unmatchedB [join [lrange $listB {*}$rangeB] ""]
        }
        changed {
            lappend unmatchedA [join [lrange $listA {*}$rangeA] ""]
            lappend unmatchedB [join [lrange $listB {*}$rangeB] ""]
        }
        deleted {
            lappend unmatchedA [join [lrange $listA {*}$rangeA] ""]
        }
    }
}
puts common->$common
# common->{the } ow {n fox}
puts A->$unmatchedA
# A->{quick br}
puts B->$unmatchedB
# B->sl { gree}
In this case, we see the following correspondences (. is a spacer I've inserted to help line things up):
the quick br..ow.....n fox
the ........slow green fox
Whether this is exactly what you want, I don't know (and there's more detail in the computed differences; they're just a bit hard to read). You can easily switch to doing a word-by-word correspondence instead if that's more to your taste. It's pretty much just removing the split and join…
If you have a string and you want to remove a fixed substring, for example
set str "this is a larger? string"
set substr "a larger?"
Then you can do this:
set parts [split [string map [list $substr \uffff] $str] \uffff]
# returns the list: {this is } { string}
That globally replaces the substring within the larger string with a single character, then splits the result on that same character.
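The same idea can be wrapped in a small proc. This is only a sketch: the name splitOnSubstring is made up for illustration, and the check for the sentinel character is an added safeguard, since the trick gives wrong results if \uffff already occurs in the input.

```tcl
# Split str on every occurrence of substr (hypothetical helper name).
proc splitOnSubstring {str substr} {
    # Sentinel character assumed not to occur in normal text.
    set sentinel \uffff
    if {[string first $sentinel $str] >= 0} {
        error "input already contains the sentinel character"
    }
    # Replace the substring with the sentinel, then split on the sentinel.
    return [split [string map [list $substr $sentinel] $str] $sentinel]
}

puts [splitOnSubstring "this is a larger? string" "a larger?"]
```

Because string map does a literal replacement, characters such as ? in the substring need no escaping, which would not be true of a regsub-based version.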

Replace several lines of commands with a single variable in tcl

I know I have been asking a lot of questions, but I'm still learning Tcl and I haven't found anything similar to this issue anywhere so far. Is it at all possible to replace a set of commands in Tcl with one variable, function0 for example?
I want to be able to replace the following code;
set f [listFromFile $path1]
set f [lsort -unique $f]
set f [lsearch -all -inline $f "test_*"]
set f [regsub -all {,} $f "" ]
set len [llength $f]
set cnt 0
with a variable function0, because this same code appears numerous times within the script. I should mention that it appears both inside and outside procs.
The code above is used with a loop like this:
while {$cnt < $len} {
    puts [lindex $f $cnt]
    incr cnt
    after 25; #not needed, but for viewing purposes
}
Variables are for storing values. To hide away (encapsulate) some lines of code you need a command procedure, which you define using the proc command.
You wanted to hide away the following lines
set f [listFromFile $path1]
set f [lsort -unique $f]
set f [lsearch -all -inline $f "test_*"]
set f [regsub -all {,} $f "" ]
set len [llength $f]
set cnt 0
to be able to just invoke for instance function0 $path1 and have all those calculations made in one fell swoop. Further, you wanted to use the result of calling the procedure in code like this:
while {$cnt < $len} {
puts [lindex $f $cnt]
# ...
Which means you want function0 to produce three different values, stored in cnt, len, and f. There are several ways to have a command procedure return multiple values, but the cleanest solution here is to make it return a single value; the list that you want to print. The value in len can be calculated from that list with a single command, and the initialization of cnt is better performed outside the command procedure. What you get is this:
proc function0 path {
    set f [listFromFile $path]
    set f [lsort -unique $f]
    set f [lsearch -all -inline $f test_*]
    set f [regsub -all , $f {}]
    return $f
}
which you can use like this:
set f [function0 $path1]
set len [llength $f]
set cnt 0
while {$cnt < $len} {
    puts [lindex $f $cnt]
    incr cnt
    after 25; #not needed, but for viewing purposes
}
or like this:
set f [function0 $path1]
set len [llength $f]
for {set cnt 0} {$cnt < $len} {incr cnt} {
    puts [lindex $f $cnt]
    after 25; #not needed, but for viewing purposes
}
or like this:
set f [function0 $path1]
foreach item $f {
    puts $item
    after 25; #not needed, but for viewing purposes
}
This is why I didn't bother to create a procedure returning three values: you only really needed one.
glenn jackman makes a very good point (or two points, actually) in another answer about the use of regsub. For completeness, I will repeat it here.
Tcl is a bit confusing because it usually allows string operations (like string substitution) on data structures that aren't formally strings. This makes the language very powerful and expressive, but also means that newbies do not always get the kick in the shins that a regular type system would give them.
In this case you created a list structure inside listFromFile by reading a string from a file and then using split on it. From that point on it's a list and you should only perform list operations on it. If you wanted to take out all commas in your data you should either perform that operation on each item in the list, or else perform the operation inside listFromFile, before splitting the text.
String operations on lists will work, but sometimes the result will be garbled, so mixing them should be avoided. The other good point was that in this case string map is preferable to regsub, if nothing else it makes the code a bit clearer.
Documentation: for, foreach, lindex, llength, lsearch, lsort, proc, puts, regsub, set, split, string, while
(more of a comment than an answer, but I want the formatting)
One thing to be aware of: $f holds a list, then you use the string command regsub on it, then you treat the result of regsub as a list again.
Use list commands with list values. I'd replace the regsub command with
set f [lmap elem $f {string map {"," ""} $elem} ]
For Tcl 8.5 or earlier, which do not have lmap, you could do this:
for {set i 0} {$i < [llength $f]} {incr i} {
    lset f $i [string map {, ""} [lindex $f $i]]
}
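Putting both answers together, a version of function0 that sticks to list operations might look like the sketch below. Note that listFromFile here is a hypothetical stand-in written for this example (the asker's real one was not shown); it assumes one item per line. The demo file name and its contents are also made up.

```tcl
# Hypothetical stand-in for the asker's listFromFile: one item per line.
proc listFromFile {path} {
    set chan [open $path r]
    set text [read -nonewline $chan]
    close $chan
    return [split $text \n]
}

proc function0 {path} {
    set f [lsort -unique [listFromFile $path]]
    set f [lsearch -all -inline $f test_*]
    # Strip commas element by element instead of using regsub on the list.
    set result {}
    foreach elem $f {
        lappend result [string map {, ""} $elem]
    }
    return $result
}

# Demo with a throwaway file in the current directory.
set path function0_demo.txt
set chan [open $path w]
puts $chan "test_b,\nother\ntest_a\ntest_b,"
close $chan
puts [function0 $path]
file delete $path
```

For this demo file the output should be test_a test_b. The foreach loop is used rather than lmap so that the sketch also runs on Tcl 8.5.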

Combinations of all characters and all lengths using fewer loops?

Brain teaser: I came up with this question myself, but I am completely stuck.
I want to create all possible combinations of all characters, for all possible lengths. Suppose [a-z] combinations of length 1, then [a-z] combinations of length 2, and so on until the maximum length is reached.
This could be done very easily with nested loops.
Example for length 3:
proc triples list {
    foreach i $list {
        foreach j $list {
            foreach k $list {
                puts [list $i $j $k]
            }
        }
    }
}
But it should be solved using fewer loops (the nesting depth needs to be dynamic):
set chars "abcdefghijklmnopqrstuvwxyz"
set chars [split $chars ""]
set complete_length [llength $chars]
set start 0
set maximum_length 15
while {1} {
    if {$start > $maximum_length} {
        break
    }
    for {set i [expr $maximum_length-$start]} {$i >= 0} {incr i -1} {
        # dump combinations
    }
    incr start
}
In this chunk, what algorithm or method should I apply? Any suggestions, help, or code will be appreciated.
Sorry, this is not an answer, but hopefully some interesting discussion anyway:
The word "combinations" is often used way too generally, so it can be interpreted in many different ways. Let's say that you have a source list of 26 different elements, the English letters, and you want to pick 3 of them and combine them in a 3-element destination list:
Can you always pick any letter from the source list, or do the elements disappear from it as you pick them? Either define "pick" (are the elements copied or moved during a pick), or define the set of source values (is there 1 of each of A-Z or an infinite amount of A-Z).
Does the order in the destination list matter? Is AHM considered to be the same combination as HAM? Define "combine".
If you have a list where not all elements are different, e.g. {2 10 10 64 100}, you have even more possibilities. Define your set of values.
Your first example prints permutations, not combinations. If that's what you want, the easiest way is a recursive procedure. Combinations are more complicated to generate.
EDIT:
I wrote this procedure for a Project Euler program. It picks all the elements, but maybe you can modify it to pick n. It takes a command prefix as argument, so you don't have to store all permutations.
package require Tcl 8.5.0
proc forEachPerm {list cmdPrefix} {
    _forEachPerm {} $list $cmdPrefix
}
proc _forEachPerm {head list cmdPrefix} {
    if {![llength $list]} {
        {*}$cmdPrefix $head
    } else {
        for {set i 0} {$i < [llength $list]} {incr i} {
            _forEachPerm [concat $head [lrange $list $i $i]] [lreplace $list $i $i] $cmdPrefix
        }
    }
}
# example use:
forEachPerm {a b c} {apply {{list} {puts [join $list]}}}
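If what the question is after is every string of length 1 up to some maximum, with letters allowed to repeat (as in the triples example), the same recursive, command-prefix style can be adapted. This is a sketch under that assumption; forEachWord is a made-up name, and it needs Tcl 8.5 or later for {*} and apply.

```tcl
package require Tcl 8.5

# Call cmdPrefix once for every string of length 1..maxLen over chars.
# The recursion depth replaces the fixed number of nested foreach loops.
proc forEachWord {chars maxLen cmdPrefix {prefix ""}} {
    if {[string length $prefix] > 0} {
        {*}$cmdPrefix $prefix
    }
    if {[string length $prefix] >= $maxLen} {
        return
    }
    foreach c $chars {
        forEachWord $chars $maxLen $cmdPrefix $prefix$c
    }
}

# example use with a tiny alphabet:
forEachWord {a b} 2 {apply {{word} {puts $word}}}
```

With the alphabet {a b} and a maximum length of 2 this prints a, aa, ab, b, ba, bb, one per line. Be aware that the number of strings grows as the alphabet size raised to the length, so 26 letters with a maximum length of 15 is far too many to enumerate in practice.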