Does resolution variable elimination modify the solutions of other variables?

So I have a CNF formula and two lists of variables, K and C.
The variables of K are added to the CNF as unit clauses (either negated or not, depending on a boolean array) before it is sent off to a SAT solver. When the SAT solver returns a model, I only care about the variables that also occur in C; all other variables in the model are discarded.
Since the CNF is to be solved repeatedly (with different variables from K set to true or false), it would be worth spending several hours simplifying it and removing unnecessary variables (unnecessary meaning any variable that is not in either K or C) if that shaves a few minutes off each solve.
My question is whether I can use resolution variable elimination, as described in this pdf or this blog post, to eliminate some variables, as long as I don't eliminate any variable that occurs in either K or C. Or will this change the resulting model with respect to the variables in C?
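For concreteness, the elimination step I mean replaces every pair of clauses containing a variable v positively and negatively with their resolvent, and then drops all clauses mentioning v. A minimal sketch in Python (my own toy representation, clauses as sets of DIMACS-style signed integers, not code from either link):

def eliminate(clauses, v):
    # Eliminate variable v (> 0) from a CNF by resolution: keep every
    # clause not mentioning v, and add all non-tautological resolvents
    # of the clauses containing v with those containing -v.
    pos  = [c for c in clauses if v in c]
    neg  = [c for c in clauses if -v in c]
    rest = [c for c in clauses if v not in c and -v not in c]
    resolvents = set()
    for p in pos:
        for n in neg:
            r = (p - {v}) | (n - {-v})
            if not any(-lit in r for lit in r):   # skip tautologies
                resolvents.add(frozenset(r))
    return rest + list(resolvents)

# Eliminating 2 from (1 or 2) and (-2 or 3) leaves the single clause (1 or 3).
cnf = [frozenset({1, 2}), frozenset({-2, 3})]
print(eliminate(cnf, 2))   # [frozenset({1, 3})]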

Related

What does "cleanup" in NEXT_INST_F and NEXT_INST_V mean?

I am plowing through the Tcl source code and got confused by the macros NEXT_INST_F and NEXT_INST_V in tclExecute.c, specifically by their cleanup parameter.
Initially I thought cleanup meant the net number of slots consumed/popped from the stack, e.g. when 3 objects are popped and 1 object is pushed, cleanup is 2.
But I see INST_LOAD_STK has cleanup set to 1; shouldn't it be zero, since one object is popped and one object is pushed?
I am lost reading the code of NEXT_INST_F and NEXT_INST_V; there are too many jumps.
I hope you can clarify the semantics of cleanup for me.
The NEXT_INST_F and NEXT_INST_V macros (in the implementation of Tcl's bytecode engine) clean up the state of the operand stack and push the result of the operation before going to the next instruction. The only practical difference between the two is that one is designed to be highly efficient when the number of stack locations to be cleaned up is a constant number (from a small range: 0, 1 and 2 — this is the overwhelming majority of cases), and the other is less efficient but can handle a variable number of locations to clean up or a number outside the small range. So NEXT_INST_F is basically an optimised version of NEXT_INST_V.
The place where macros are declared in tclExecute.c has this to say about them:
/*
* The new macro for ending an instruction; note that a reasonable C-optimiser
* will resolve all branches at compile time. (result) is always a constant;
* the macro NEXT_INST_F handles constant (nCleanup), NEXT_INST_V is resolved
* at runtime for variable (nCleanup).
*
* ARGUMENTS:
* pcAdjustment: how much to increment pc
* nCleanup: how many objects to remove from the stack
* resultHandling: 0 indicates no object should be pushed on the stack;
* otherwise, push objResultPtr. If (result < 0), objResultPtr already
* has the correct reference count.
*
* We use the new compile-time assertions to check that nCleanup is constant
* and within range.
*/
However, instructions can also directly manipulate the stack. This complicates things quite a lot. Most don't, but that's not the same as all. If you were to view this particular load of code as one enormous pile of special cases, you'd not be very wrong.
INST_LOAD_STK (a.k.a. loadStk if you're reading disassembly of some Tcl code) is an operation that pops an unparsed variable name from the stack and pushes the value read from the variable with that name. (Or an error will be thrown.) It is totally expected to pop one value and push another (from objResultPtr), since we pop the variable-name value (decrementing its reference count) and push a different value read from the variable (incrementing its reference count).
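To make the counting concrete: cleanup counts only the operands the instruction consumes; pushing objResultPtr is governed separately by the result argument. A toy model of that stack discipline (hypothetical code, not the real tclExecute.c):

#include <stdio.h>

/* Toy model of the operand stack; the real engine works on Tcl_Obj
 * pointers and reference counts. */
typedef struct {
    int slots[64];
    int top;                  /* number of occupied slots */
} Stack;

/* What NEXT_INST_*(pcAdjustment, nCleanup, result) amounts to for the
 * stack: remove nCleanup operands, then push the result if requested. */
static void next_inst(Stack *s, int nCleanup, int result, int objResultPtr)
{
    s->top -= nCleanup;                     /* drop consumed operands */
    if (result) {
        s->slots[s->top++] = objResultPtr;  /* push the operation's result */
    }
}

int main(void)
{
    Stack s = { {0}, 0 };
    s.slots[s.top++] = 101;  /* stand-in for the unparsed variable name */

    /* INST_LOAD_STK: consume the name (nCleanup = 1), push the value
     * read from the variable -- net stack depth is unchanged. */
    next_inst(&s, 1, 1, 202);

    printf("depth=%d top=%d\n", s.top, s.slots[s.top - 1]); /* depth=1 top=202 */
    return 0;
}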
The code to read and write variables is among the most twisty in the bytecode engine. Far more goto than is good for your health.

Octave force deepcopy

The question
What are the ways of coercing Octave into creating a real copy of an object? Structures are the main interest.
My underlying problem
In my problem I'm obtaining a rather large structure from another function in a loop, but for the current task only a few pieces of it are needed. For example:
for i=1:many
  res=solver(params);
  store1{i}=res.string1;
  store2{i}=res.arr(:,1);
end
res is a sizable chunk of data, and due to lazy copying those store-s are references to tiny portions of bytes in that chunk. After I store those tiny portions I don't need res itself any more; however, since the middle of that chunk is referenced by the stores, the memory area is unfit for the res obtained on the next iteration (they are of the same size), and thus another sizable piece of memory is allocated, which is then again crossed by a few tiny links, and so on.
Without storing parts of res, the program successfully keeps memory consumption constant after the first couple of iterations.
So how do I make a complete copy of structure field?
I've tried using struct-related functions like rmfield, but those keep references instead of creating their own objects.
I've tried to wrap the assignment in its own function:
new_struct=copy( rmfield(old_struct,"bigdata"));
function c=copy(a);
c=a;
end;
This, by the way, doesn't work even for arrays.
I'm interested in a method applicable to any generic variable.
Minimal working example of the problem
a=cell(3,1);
for i=1:length(a);
  r=rand(100000,1000);
  a{i}=r(1:100,end);
  whos; fflush(stdout);
  pause(2);
end;
The above code will cause memory usage to grow gradually by far more than the 8.08 kB reported by whos, because the references stored in a{i} keep alive a much bigger memory block than they actually need. If you force a proper copy, the problem is not present.
Numerical arrays
For numeric types, adding zero is enough to force creation of a new array:
c=a+0;
Strings
For a string, which is a 1 x n char array, something along the following lines will work:
c=[a "a"](1:end-1);
Multidimensional char arrays will require concatenation with a column:
c=[a true(size(a,1),1)](:,1:end-1);
Here true is used to generate a dummy array of a size compatible with char. (There seems to be no procedural method of generating a char array of arbitrary size.) char(zeros(size(a,1),1)) and char(true(size(a,1),1)) caused excess memory usage during their creation on some calls.
Note that the empty concatenation c=[a ""]; will not result in a copy. It is also possible to do c=[a+0 ""];, which forces a copy due to the +0, but that incurs type conversion to and from double, which is 8 times larger in size. (char(zeros( doesn't seem to cause that.)
Other types
In general, you can use casting for the types that allow it, so as not to tailor the expressions manually as I had to do above:
typelist={"double","single","char"}; %full list of supported types is available in the link
class_of_a = typelist{ isa(a,typelist) };
c=typecast( [typecast(a,'single'); single(1)] (1:end-1), class_of_a);
single is a conveniently small datatype for this (the 1-byte integer types such as int8 are smaller still).
Note that logical is not supported by this method.
Copying structures
Apparently you'd have to write your own function that walks the struct fields, copies them with the above methods, and recurses into substructs.
(As that doesn't involve any complexities relevant here, I'd rather leave it to those who actually need it, my own problem being solved by +0's.)
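For what it's worth, a minimal sketch of such a recursive copy, combining the tricks above (the name deepcopy is mine; only numeric, 1 x n char and nested struct fields are handled):

function c = deepcopy(s)
  % Recursively copy a scalar struct, forcing real copies of its fields.
  c = struct();
  for [val, key] = s
    if (isstruct(val))
      c.(key) = deepcopy(val);            % recurse into substructs
    elseif (isnumeric(val))
      c.(key) = val + 0;                  % +0 forces a fresh array
    elseif (ischar(val) && rows(val) == 1)
      c.(key) = [val "x"](1:end-1);       % concatenate-and-trim forces a copy
    else
      c.(key) = val;                      % other types may stay lazy references
    endif
  endfor
endfunction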

Static chains and binding

I'm confused about how binding works for statically scoped variables in nested subroutines.
proc A:
    var a, x
    ...
    proc B:
        var x, y
        ...
        proc B2:
            var a, b
            ...
        end B2
    end B
    proc C:
        var x, z, w
        ....
    end C
end A
First, this is what I have understood: if static scoping is considered, then B2 can use the variables x and y present in its parent B. Similarly, C can use the variable a declared in proc A.
Now, my questions are: are these bindings made at compile time or at run time? Does it make a difference whether the variables are statically scoped or dynamically scoped?
Until it comes naturally, I find it easiest to draw environment model diagrams. They are also pretty much essential for exams and for those esoteric examples that are intended to be confusing. I suggest the famous SICP (http://mitpress.mit.edu/sicp/), but there are obviously more than enough resources on the internet (a quick Google search brought me to this: http://www.icsi.berkeley.edu/~gelbart/cs61a/EnvDiagrams.pdf).
It depends on the language/implementation when and how bindings are done; however, in your example the bindings can be made at compile time. In general, static scoping, as the name suggests, allows for a lot of static (compile-time) binding. A compiler can look into a function, see all the references, and resolve them immediately. For example, in B2 a reference to y can be resolved immediately as belonging to the enclosing scope, i.e. that of B.
As for dynamic vs. static scoping, there is a huge difference. Dynamic scoping, as the name suggests, is much harder to do compile-time binding with, since the structure of the code does not determine what the variable references bind to: different paths of execution may yield different bindings. You'll have to be more specific with the question, though.
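A small hypothetical illustration in Python, which is statically scoped, showing both points at once:

def A():
    x = "A.x"

    def B2():
        # Static scoping: this x resolves through the textual nesting
        # (B2 is written inside A), so it is A's x. That resolution can
        # be done entirely at compile time.
        return x

    def C():
        x = "C.x"
        # C calls B2. A dynamically scoped language would instead walk
        # the call stack at run time, find C's x first, and yield "C.x".
        return B2()

    return C()

print(A())   # prints "A.x"; under dynamic scoping it would print "C.x"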

How to tell whether Haskell will cache a result or recompute it?

I noticed that sometimes Haskell pure functions are somehow cached: if I call the function twice with the same parameters, the second time the result is computed in no time.
Why does this happen? Is it a GHCI feature or what?
Can I rely on this (i.e., can I deterministically know whether a function's value will be cached)?
Can I force or disable this feature for some function calls?
As requested in the comments, here is an example I found on the web:
isPrime a = isPrimeHelper a primes

isPrimeHelper a (p:ps)
  | p*p > a        = True
  | a `mod` p == 0 = False
  | otherwise      = isPrimeHelper a ps

primes = 2 : filter isPrime [3,5..]
Before running it, I was expecting it to be quite slow, since it keeps accessing elements of primes without explicitly caching them (thus, unless these values were cached somewhere, they would need to be recomputed many times). But I was wrong.
If I set +s in GHCI (to print timing/memory stats after each evaluation) and evaluate the expression primes!!10000 twice, this is what I get:
*Main> :set +s
*Main> primes!!10000
104743
(2.10 secs, 169800904 bytes)
*Main> primes!!10000
104743
(0.00 secs, 0 bytes)
This means that at least primes !! 10000 (or better: the whole prefix of primes up to that point, since primes!!9999 will also take no time) must be cached.
primes, in your code, is not a function but a constant, in Haskell-speak known as a CAF. If it took a parameter (say, ()), you would get two different versions of the same list back when calling it twice, but as it is a CAF, you get the exact same list back both times.
As a ghci top-level definition, primes never becomes unreachable, thus the head of the list it points to (and therefore its tail, i.e. the rest of the computation) is never garbage collected. Adding a parameter prevents retaining that reference: the list would then be garbage collected as (!!) iterates down it to find the right element, and your second call to (!!) would force a repetition of the whole computation instead of just traversing the already-computed list.
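A sketch of that contrast (primesCAF and primesFn are my names, not from the question; note too that with optimisations enabled, GHC's full-laziness transformation may float the local list out of the lambda and share it anyway):

-- primesCAF is a CAF: evaluated at most once, then shared; in ghci the
-- top-level reference keeps the computed prefix alive.
primesCAF :: [Integer]
primesCAF = 2 : filter isPrime [3,5..]
  where
    isPrime a = all (\p -> a `mod` p /= 0)
                    (takeWhile (\p -> p * p <= a) primesCAF)

-- primesFn builds the list under a lambda, so (unoptimised) each call
-- constructs a fresh list that can be collected as (!!) walks down it.
primesFn :: () -> [Integer]
primesFn () = ps
  where
    ps = 2 : filter isPrime [3,5..]
    isPrime a = all (\p -> a `mod` p /= 0)
                    (takeWhile (\p -> p * p <= a) ps)

main :: IO ()
main = do
  print (primesCAF !! 10000)   -- slow: forces the computation
  print (primesCAF !! 10000)   -- fast: reuses the retained list
  print (primesFn () !! 10000) -- slow
  print (primesFn () !! 10000) -- slow again (unless ps gets floated out)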
Note that in compiled programs there is no top-level scope like in ghci, and things get garbage collected when the last reference to them is gone, quite likely before the whole program exits, CAF or not. That means your first call would take long, the second one would not, and after that, once "the future of your program" no longer references the CAF, the memory the CAF takes up is recycled.
The primes package provides a function that takes an argument for (primarily, I'd claim) this very reason, as carrying around half a terabyte of prime numbers might not be what one wants to do.
If you want to really get down to the bottom of this, I recommend reading the STG paper. It doesn't include newer developments in GHC, but does a great job of explaining how Haskell maps onto assembly, and thus how thunks get eaten by strictness, in general.

Does Fortran preserve the value of internal variables through function and subroutine calls?

After much painful debugging, I believe I've found a unique property of Fortran that I'd like to verify here at Stack Overflow.
What I've been noticing is that, at the very least, the values of internal logical variables are preserved across function or subroutine calls.
Here is some example code to illustrate my point:
PROGRAM function_variable_preserve
  IMPLICIT NONE
  CHARACTER(len=8) :: func_negative_or_not ! Declares function name
  INTEGER :: input
  CHARACTER(len=8) :: output

  input = -9
  output = func_negative_or_not(input)
  WRITE(*,10) input, " is ", output
10 FORMAT("FUNCTION: ", I2, 2A)
  CALL sub_negative_or_not(input, output)
  WRITE(*,20) input, " is ", output
20 FORMAT("SUBROUTINE: ", I2, 2A)
  WRITE(*,*) 'Expected negative.'

  input = 7
  output = func_negative_or_not(input)
  WRITE(*,10) input, " is ", output
  CALL sub_negative_or_not(input, output)
  WRITE(*,20) input, " is ", output
  WRITE(*,*) 'Expected positive.'
END PROGRAM function_variable_preserve

CHARACTER(len=*) FUNCTION func_negative_or_not(input)
  IMPLICIT NONE
  INTEGER, INTENT(IN) :: input
  LOGICAL :: negative = .FALSE.

  IF (input < 0) THEN
    negative = .TRUE.
  END IF

  IF (negative) THEN
    func_negative_or_not = 'negative'
  ELSE
    func_negative_or_not = 'positive'
  END IF
END FUNCTION func_negative_or_not

SUBROUTINE sub_negative_or_not(input, output)
  IMPLICIT NONE
  INTEGER, INTENT(IN) :: input
  CHARACTER(len=*), INTENT(OUT) :: output
  LOGICAL :: negative = .FALSE.

  IF (input < 0) THEN
    negative = .TRUE.
  END IF

  IF (negative) THEN
    output = 'negative'
  ELSE
    output = 'positive'
  END IF
END SUBROUTINE sub_negative_or_not
This is the output:
FUNCTION: -9 is negative
SUBROUTINE: -9 is negative
Expected negative.
FUNCTION: 7 is negative
SUBROUTINE: 7 is negative
Expected positive.
As you can see, once the function or subroutine has been called once, the logical variable negative, if switched to .TRUE., remains that way despite the initialization of negative to .FALSE. in the type declaration statement.
I could, of course, correct this problem by just adding a line
negative = .FALSE.
after declaring the variable in my function and subroutine.
However, it seems very odd to me that this be necessary.
For the sake of portability and code reusability, shouldn't the language (or compiler maybe) require re-initialization of all internal variables each time the subroutine or function is called?
To answer your question: yes, Fortran does preserve the value of internal variables through function and subroutine calls.
Under certain conditions ...
If you declare an internal variable with the SAVE attribute, its value is saved from one call to the next. This is, of course, useful in some cases.
However, your question is a common reaction upon first learning about one of Fortran's gotchas: if you initialise an internal variable in its declaration, then it automatically acquires the SAVE attribute. You have done exactly that in your subroutines. This is standard-conforming. If you don't want this to happen, don't initialise in the declaration.
This is the cause of much surprise and complaint from (some) newcomers to the language. But no matter how hard they complain it's not going to change so you just have to (a) know about it and (b) program in awareness of it.
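A minimal sketch of the difference (a hypothetical subroutine, not the code from the question):

PROGRAM save_gotcha
  IMPLICIT NONE
  CALL demo()   ! prints: saved_flag = F  fresh_flag = F
  CALL demo()   ! prints: saved_flag = T  fresh_flag = F
END PROGRAM save_gotcha

SUBROUTINE demo()
  IMPLICIT NONE
  LOGICAL :: saved_flag = .FALSE.  ! initialised in the declaration:
                                   ! implicitly SAVEd, set once only
  LOGICAL :: fresh_flag
  fresh_flag = .FALSE.             ! executable assignment: runs on every call
  WRITE(*,*) 'saved_flag =', saved_flag, ' fresh_flag =', fresh_flag
  saved_flag = .TRUE.
  fresh_flag = .TRUE.
END SUBROUTINE demo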
This isn't too different from static function-scoped variables in C or C++.
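The C analogue, for comparison (a hypothetical sketch):

#include <stdio.h>

void demo(void)
{
    static int calls = 0;   /* initialised once; value kept across calls */
    int scratch = 0;        /* re-initialised on every call */
    calls++;
    printf("calls=%d scratch=%d\n", calls, scratch);
}

int main(void)
{
    demo();   /* calls=1 scratch=0 */
    demo();   /* calls=2 scratch=0 */
    return 0;
}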
Programming language design was in its infancy back when FORTRAN was developed. If it were being designed from scratch today, no doubt many of the design decisions would have been different.
Originally, FORTRAN didn't even support recursion, there was no dynamic memory allocation, programs played all sorts of type-punning games with COMMON blocks and EQUIVALENCE statements, procedures could have multiple entry points... so the memory model was basically for the compiler/linker to lay out everything, even local variables and numeric literal constants, into fixed storage locations, rather than on the stack. If you wanted, you could even write code that changed the value of "2" to "42"!
By now, there is an awful lot of legacy FORTRAN code out there, and compiler writers go to great lengths to preserve backward-compatible semantics. I can't quote chapter and verse from the standard that mandates the behavior you've noted, nor its rationale, but it seems reasonable that backward compatibility trumped modern language design sensibilities, in this instance.
This has been discussed several times here, most recently at Fortran assignment on declaration and SAVE attribute gotcha
You don't have to discover this behavior by experimentation; it is clearly stated in the better textbooks.
Different languages are different and have different behaviors.
There is a historical reason for this behavior. Many compilers for Fortran 77 and earlier preserved the values of ALL local variables across calls of procedures. Programmers weren't supposed to rely upon this behavior, but many did. According to the standard, if you wanted a local (non-COMMON) variable to retain its value, you needed to use SAVE, but many programmers ignored this. In that era programs were less frequently ported to different platforms and compilers, so incorrect assumptions might never be noticed. It is common to find this problem in legacy programs; current Fortran compilers typically provide a compiler switch to cause all variables to be saved.
It would be silly for the language standard to require that all local variables retain their values. But an intermediate requirement that would rescue many programs that were careless with SAVE would be to require all variables initialized in their declarations to automatically have the SAVE attribute. Hence what you discovered....