Simplifying if statement logic - language-agnostic

I have separated out a test to determine if two schedule items overlap, because it was unreadable inline.
Is there any application to assist in simplifying a logic statement?
Example: (originally a faulty example, but exposes reasons I request this)
if (x < y && y < z && x < z)
could be reduced to
if (x < y && y < z)
My code is:
return (shift.Start <= shift2.Start && shift.End >= shift2.End) || (shift2.Start <= shift.Start && shift2.End >= shift.Start)
I would love to make that simpler, and I believe it's possible, just unsure how.
Seeing as this is really language agnostic, even converting to a different script to find possibilities would be nice, no need for it to be in C# for instance.

Kill the duplicate logic and you'll kill two birds with one stone. You'll get DRY, and you'll get a function name (the rich man's comment):
class Shift:
    def encompasses(self, other_shift):
        return self.start <= other_shift.start and self.end >= other_shift.end
...
return shift1.encompasses(shift2) or shift2.encompasses(shift1)

Be very, very careful with these kinds of changes. They might seem straightforward at initial glance, and boolean logic (and DeMorgan's laws) are not too hard to grasp, but there are often potential gotchas when you look at individual cases:
For example: if (x < y && y < z) could be simplified to if (x < z)
This is not correct, if (x < z), y might still be greater than z. That state would not pass your original tests.

Although x < y && y < z implies x < z (< is transitive), the reverse is not true so the expressions are not equivalent. Indeed if y were defined as, say, Integer y = null, then the former may even cause an NPE in Java, or UNKNOWN in SQL.
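One cheap way to catch this kind of mistake before committing to a "simplification" is to brute-force both versions over a small range of values. The sketch below is only an illustration (the helper name equivalent and the test range are my own choices, not from the question), and agreement on a small range is evidence rather than proof:

```python
from itertools import product

def equivalent(f, g, values=range(-3, 4)):
    """Return True if f and g agree on every (x, y, z) drawn from values."""
    return all(f(x, y, z) == g(x, y, z)
               for x, y, z in product(values, repeat=3))

# The three-clause test from the question and two candidate rewrites.
original = lambda x, y, z: x < y and y < z and x < z
without_redundant = lambda x, y, z: x < y and y < z
too_aggressive = lambda x, y, z: x < z

print(equivalent(original, without_redundant))  # True
print(equivalent(original, too_aggressive))     # False
```

Dropping the redundant x < z clause survives the check; collapsing everything to x < z does not.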

When it comes to complex logic statements, you're usually better off formatting your code in a readable manner than attempting some premature optimization (root of all evil, etc.).
For example:
return (shift.Start <= shift2.Start && shift.End >= shift2.End) || (shift2.Start <= shift.StartTime && shift2.End >= shift.Start)
Can, for readability and maintainability, be refactored to:
bool bRetVal = false;
bRetVal = ( ( (shift.Start <= shift2.Start)
            && (shift.End >= shift2.End))
         || ( (shift2.Start <= shift.StartTime)
            && (shift2.End >= shift.Start)));
return bRetVal;
Most places maintain a coding standard that defines something like the above for large logic blocks. I'd much rather maintain a few extra lines of code that can be read and understood than a one line monstrosity.

Is there any application to assist in simplifying a logic statement?
I didn't see anyone addressing this part of the question, so I'll take a stab and see what discussion occurs.
There are techniques for working with boolean logic. Back in my college days (BSEE) we used Karnaugh maps. Basically, you can take a very complex arbitrary truth table and determine a correct and optimized logical expression. We used this to reduce the number of logic gates in a circuit, which is analogous to simplifying a complex if statement.
Pros:
You can implement/optimize a very complex and arbitrary truth table relatively easily.
Cons:
The resulting logic usually bore little resemblance to the intent of the truth table. As others have suggested, this is "unreadable".
A change to a single cell of the truth table would often result in a completely different expression. A simple design tweak becomes a re-write, so it's unmaintainable.
Non-optimized logic is a lot cheaper than it used to be, while the design costs are the same.
Ultimately, the most critical thing is the correctness of the truth table/logic expression. An error means your program won't work right. No application or design technique will help if you don't properly understand the definition of the logic that needs to be implemented.
In my opinion, few real-world problems are sufficiently complex to truly benefit from this type of technique, but it does exist.
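If you do reduce an expression by hand or with a Karnaugh map, comparing full truth tables is a simple way to confirm the reduced form still matches. The following is a minimal sketch; the three-input expression is a made-up example rather than anything from the question, and truth_table is a hypothetical helper:

```python
from itertools import product

def truth_table(expr, n_inputs):
    """List the output of expr for every combination of n_inputs booleans."""
    return [expr(*bits) for bits in product([False, True], repeat=n_inputs)]

# A sum-of-products form, as you might read it straight off a truth table.
verbose = lambda a, b, c: (a and b and c) or (a and b and not c) or (a and not b and c)

# The form a Karnaugh map would give for the same table.
reduced = lambda a, b, c: a and (b or c)

print(truth_table(verbose, 3) == truth_table(reduced, 3))  # True
```

With n inputs the table has 2^n rows, so this only stays practical for a handful of conditions.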

You need to be really careful when doing this... the example you gave, for example, simply isn't true...
if x = 1, y = 2, and z = 2, then x < y = true, and x < z = true, but y < z = false.
For this type of reasoning, you really want to go for code readability in these circumstances and not worry about the most efficient code that you can get.

Assuming Start and StartTime are actually supposed to be the same field, your condition boils down to
(a <= b && c >= d) || (b <= a && d >= c)
We can turn this into
(a <= b && d <= c) || (b <= a && c <= d)
but this still doesn't look like it simplifies much.

Sometimes you can wrap statements such as:
shift.Start <= shift2.Start && shift.End >= shift2.End
Into a boolean function to make it more readable such as:
function ShiftWithinValidRange //(terrible name here, but you get the idea)
{
return (shift.Start <= shift2.Start && shift.End >= shift2.End);
}

Not only is this dangerous, but it often leads to harder to maintain code. Boolean logic is easier to understand when broken down into specific steps. Condensing the logic will often lead to harder to understand logic.
i.e. in your example, why are we checking if x < z , when what we really want to know is x < y && y < z?
The simplest solution is often the best one. Condensing your logic into 'cooler' but less readable code is not good in the long run.

I don't have a general solution for you, but if I use Lisp syntax it looks a lot simpler to me:
(and (< x y)
(< y z)
(< x z))
Then notice that the first two clauses:
(and (< x y)
(< y z))
can be combined into:
(and (< x y z))
So the full expression now looks like:
(and (< x y z)
(< x z))
Now it's obvious that the second one is redundant, so it's down to:
(and (< x y z))
or simply:
(< x y z)
which in C-syntax is:
(x < y && y < z)

I think Wayne Conrad's answer is the right one, but for entertainment purposes only, here's another way to say it (I think):
(long) shift.Start.CompareTo(shift2.Start) * (long) shift.End.CompareTo(shift2.End) <= 0
Is this actually faster? I don't know. It's certainly harder to read.

Can I define a maxima function f(x) which assigns to the argument x

Sorry for the basic question, but it's quite hard to find too much discussion on Maxima specifics.
I'm trying to learn some Maxima and wanted to use something like
x:2
x+=2
which as far as I can tell doesn't exist in Maxima. Then I discovered that I can define my own operators as infix operators, so I tried doing
infix("+=");
"+=" (a,b):= a:(a+b);
However this doesn't work, as if I first set x:1 then try calling x+=2, the function returns 3, but if I check the value of x I see it hasn't changed.
Is there a way to achieve what I was trying to do in Maxima? Could anyone explain why the definition I gave fails?
Thanks!
The problem with your implementation is that there is too much and too little evaluation -- the += function doesn't see the symbol x so it doesn't know to what variable to assign the result, and the left-hand side of an assignment isn't evaluated, so += thinks it is assigning to a, not x.
Here's one way to get the right amount of evaluation. ::= defines a macro, which is just a function which quotes its arguments, and for which the return value is evaluated again. buildq is a substitution function which quotes the expression into which you are substituting. So the combination of ::= and buildq here is to construct the x: x + 2 expression and then evaluate it.
(%i1) infix ("+=") $
(%i2) "+="(a, b) ::= buildq ([a, b], a: a + b) $
(%i3) x: 100 $
(%i4) macroexpand (x += 1);
(%o4) x : x + 1
(%i5) x += 1;
(%o5) 101
(%i6) x;
(%o6) 101
(%i7) x += 1;
(%o7) 102
(%i8) x;
(%o8) 102
So it is certainly possible to do so, if you want to do that. But may I suggest maybe you don't need it? Modifying a variable makes it harder to keep track, mentally, what is going on. A programming policy such as one-time assignment can make it easier for the programmer to understand the program. This is part of a general approach called functional programming; perhaps you can take a look at that. Maxima has various features which make it possible to use functional programming, although you are not required to use them.

Rather x <= 1 or x < 2?

Say we have x which is an integer.
Is there any reason to prefer one of x <= 1 and x < 2 over the other? I mean, is one maybe faster or more readable?
This question is language independent, meaning if the answer is different for two different languages, please let me know.
Usually when I loop over a zero-based collection, I use i < col.Length, as it is more readable than i <= col.Length - 1. If I am iterating from 1 to x, I use for (int i = 1; i <= x ...), as it is more readable than < x + 1. Both of these instructions have the same time requirement (at least on x86 architecture), so yes, it is only about readability.
I would say that it depends on the requirements of your software, was it specified that x needs to be one or less or was it specified that x needs to be less than two?
If you ever changed x to be of a number type that allows decimal points, which way would work best then? This happens more often than you think and can introduce some interesting bugs.
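To make the point about non-integer types concrete, here is a tiny sketch (the function names are mine, purely for illustration) showing that the two tests agree on integers but disagree as soon as a fractional value shows up:

```python
def at_most_one(x):
    """The 'x <= 1' form of the test."""
    return x <= 1

def below_two(x):
    """The 'x < 2' form of the test."""
    return x < 2

# Same answers for the integers, different answers for 1.5.
for x in (0, 1, 1.5, 2):
    print(x, at_most_one(x), below_two(x))
```

For 1.5 the first test is False and the second is True, so which one you wrote starts to matter.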
In general there is no evidence what types the literals 1 and 2 have. In most languages they will be the same, but in theory they could be different and then the results of two comparisons could be different as well. Also, integers are not infinite in most languages, so the behavior could be different on boundaries.
In plain C, if the comparisons are x <= -0x80000000 and x < -0x7fffffff (note that -0x80000000 < -0x7fffffff) and x has type int, the results depend on the value of x:
-0x80000000 : 1 1
-0x7fffffff .. -1 : 0 0
0 .. 0x7fffffff: 1 0
In other words, for all non-negative x, the results will be different.
Similarly, with comparisons x <= 0x7fffffff and x < 0x80000000 (the relation between constants 0x7fffffff < 0x80000000 still holds), we get:
-0x80000000 .. -1 : 1 0
0 .. 0x7fffffff: 1 1
Now the results are different for all negative values of x.
Clearly, there are some typing rules and type conversions involved (they are described in the C language standard), but the point is that the two comparisons are not replaceable on boundary cases.
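The tables above can be reproduced with a small model of the conversions. This sketch assumes a 32-bit int and that the literal 0x80000000 has type unsigned int (so any comparison against it is done in unsigned arithmetic); as_u32 is a hypothetical helper standing in for C's int-to-unsigned conversion:

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def as_u32(x):
    """Model C's conversion of a 32-bit int to unsigned int (wraps modulo 2**32)."""
    return x & 0xFFFFFFFF

def first_pair(x):
    # x <= -0x80000000: the constant is unsigned with value 0x80000000,
    # so x is converted to unsigned before comparing.
    # x <  -0x7fffffff: both operands are int, an ordinary signed comparison.
    return int(as_u32(x) <= 0x80000000), int(x < -0x7FFFFFFF)

def second_pair(x):
    # x <= 0x7fffffff: signed comparison.
    # x <  0x80000000: unsigned comparison.
    return int(x <= 0x7FFFFFFF), int(as_u32(x) < 0x80000000)

for x in (INT_MIN, -1, 0, INT_MAX):
    print(x, first_pair(x), second_pair(x))
```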

complexity calculation and logic concept

I'm trying to find the complexity of the following code, and I don't know if I'm using my logic right. Please correct me if I made a mistake.
1)
For i = 1 to N
j = v
j = j / 2
k = i
While k >= 1
do some kind of processing
k = k / 2 // integer division
2)
For i = 1 to N
d = d / 2 // integer division
k = i
While k >= 1
k = k-1
This one should also be N * log N?
3)
For i = 1 to N
    call functiontwo(i)
functiontwo(x)
    if (x <= 0)
        return some value
This one should be also n * log N, or am I wrong, because it is calling function two, and function two is log n?
Please let me know if I did the right way or give advice on figure out the loop logic better, thank you.
(Disclaimer: I haven't done these in a while, but since nobody else has jumped in yet, my two cents are hopefully better than nothing.)
I believe your logic is sound on #1. The i loop should be O(N), and the j and k loops appear to be O(logN), making the overall O(NlogN).
I question your conclusion on #2, though. Since k is decremented by one instead of divided, it seems to me that the k loop would be O(N), making for O(N^2) overall.
Hmmm...#3 is weird. I see why your first thought is O(NlogN). The division ordinarily would make it analogous to #1. Except... the first argument sent to functiontwo will be a positive value from the i loop. Since x > 0, it will then call functiontwo with half the original argument, which is still positive. Which will happen again, and again, etc. The mathematician in me starts to think that will never end. But I suppose one could argue that eventually you will reach the limit of the precision of your numeric data type and eventually have the result of x/2 be so close to zero that the computer counts it as zero. In that case, I imagine O(NlogN) would be accurate.
BTW, my answer for #3 is assuming that the x/2 is not integer division since you specified it for the others, but not for this one.
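If you want to sanity-check the estimates for #1 and #2, you can simply count inner-loop iterations. This is a rough sketch that models only the k loops from the pseudocode (the function names are mine, and the unrelated j and d assignments are left out):

```python
def halving_steps(n):
    """Inner-loop iterations for #1: k = i, then k = k / 2 (integer) while k >= 1."""
    total = 0
    for i in range(1, n + 1):
        k = i
        while k >= 1:
            total += 1
            k //= 2
    return total

def decrement_steps(n):
    """Inner-loop iterations for #2: k = i, then k = k - 1 while k >= 1."""
    total = 0
    for i in range(1, n + 1):
        k = i
        while k >= 1:
            total += 1
            k -= 1
    return total

for n in (8, 64, 512):
    print(n, halving_steps(n), decrement_steps(n))
```

The second count is exactly N(N+1)/2, which is the O(N^2) behavior, while the first grows roughly like N log2 N.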

Is there a way to avoid creating an array in this Julia expression?

Is there a way to avoid creating an array in this Julia expression:
max((filter(n -> string(n) == reverse(string(n)), [x*y for x = 1:N, y = 1:N])))
and make it behave similar to this Python generator expression:
max(x*y for x in range(N+1) for y in range(x, N+1) if str(x*y) == str(x*y)[::-1])
The Julia version is 2.3 times slower than Python due to array allocation and N*N iterations vs. Python's N*N/2.
EDIT
After playing a bit with a few implementations in Julia, the fastest loop style version I've got is:
function f(N) # 320ms for N=1000 Julia 0.2.0 i686-w64-mingw32
    nMax = NaN
    for x = 1:N, y = x:N
        n = x*y
        s = string(n)
        s == reverse(s) || continue
        nMax < n && (nMax = n)
    end
    nMax
end
but an improved functional version isn't far behind (only 14% slower or significantly faster, if you consider 2x larger domain):
function e(N) # 366ms for N=1000 Julia 0.2.0 i686-w64-mingw32
    isPalindrome(n) = string(n) == reverse(string(n))
    max(filter(isPalindrome, [x*y for x = 1:N, y = 1:N]))
end
There is 2.6x unexpected performance improvement by defining isPalindrome function, compared to original version on the top of this page.
We have talked about allowing the syntax
max(f(x) for x in itr)
as a shorthand for producing each of the values f(x) in one coroutine while computing the max in another coroutine. This would basically be shorthand for something like this:
max(@task for x in itr; produce(f(x)); end)
Note, however, that this syntax that explicitly creates a task already works, although it is somewhat less pretty than the above. Your problem can be expressed like this:
max(@task for x=1:N, y=x:N
    string(x*y) == reverse(string(x*y)) && produce(x*y)
end)
With the hypothetical producer syntax above, it could be reduced to something like this:
max(x*y if string(x*y) == reverse(string(x*y)) for x=1:N, y=x:N)
While I'm a fan of functional style, in this case I would probably just use a for loop:
m = 0
for x = 1:N, y = x:N
    n = x*y
    string(n) == reverse(string(n)) || continue
    m < n && (m = n)
end
Personally, I don't find this version much harder to read and it will certainly be quite fast in Julia. In general, while functional style can be convenient and pretty, if your primary focus is on performance, then explicit for loops are your friend. Nevertheless, we should make sure that John's max/filter/product version works. The for loop version also makes other optimizations easier to add, like Harlan's suggestion of reversing the loop ordering and exiting on the first palindrome you find. There are also faster ways to check if a number is a palindrome in a given base than actually creating and comparing strings.
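As an illustration of the last point, a palindrome test can be done with integer arithmetic alone by reversing the digits. The sketch below is written in Python (the language of the generator expression in the question) rather than Julia; the names is_palindrome and largest_palindrome_product are my own, and I make no claim about how its speed compares with the string version in either language:

```python
def is_palindrome(n, base=10):
    """Check whether the non-negative integer n reads the same both ways in base."""
    reversed_n, remaining = 0, n
    while remaining > 0:
        remaining, digit = divmod(remaining, base)
        reversed_n = reversed_n * base + digit
    return reversed_n == n

def largest_palindrome_product(limit):
    """Largest palindromic x*y with 1 <= x <= y <= limit, or 0 if there is none."""
    best = 0
    for x in range(1, limit + 1):
        for y in range(x, limit + 1):
            n = x * y
            # Test the cheap comparison first so fewer products get digit-reversed.
            if n > best and is_palindrome(n):
                best = n
    return best

print(largest_palindrome_product(99))  # 9009
```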
As to the general question of "getting flexible generators and list comprehensions in Julia", the language already has
A general high-performance iteration protocol based on the start/done/next functions.
Far more powerful multidimensional array comprehensions than most languages. At this point, the only missing feature is the if guard, which is complicated by the interaction with multidimensional comprehensions and the need to potentially dynamically grow the resulting array.
Coroutines (aka tasks) which allow, among other patterns, the producer-consumer pattern.
Python has the if guard but doesn't worry about comprehension performance nearly as much – if we're going to add that feature to Julia's comprehensions, we're going to do it in a way that's both fast and interacts well with multidimensional arrays, hence the delay.
Update: The max function is now called maximum (maximum is to max as sum is to +) and the generator syntax and/or filters work on master, so for example, you can do this:
julia> @time maximum(100x - x^2 for x = 1:100 if x % 3 == 0)
0.059185 seconds (31.16 k allocations: 1.307 MB)
2499
Once 0.5 is out, I'll update this answer more thoroughly.
There are two questions being mixed together here: (1) can you filter a list comprehension mid-comprehension (for which the answer is currently no) and (2) can you use a generator that doesn't allocate an array (for which the answer is partially yes). Generators are provided by the Iterators package, but the Iterators package seems to not play well with filter at the moment. In principle, the code below should work:
max((x, y) -> x * y,
filter((x, y) -> string(x * y) == reverse(string(x * y)),
product(1:N, 1:N)))
I don't think so. There aren't currently filters in Julia array comprehensions. See discussion in this issue.
In this particular case, I'd suggest just nested for loops if you want to get faster computation.
(There might be faster approaches where you start with N and count backwards, stopping as soon as you find something that succeeds. Figuring out how to do that correctly is left as an exercise, etc...)
As mentioned, this is now possible (using Julia 0.5.0)
isPalindrome(n::String) = n == reverse(n)
fun(N::Int) = maximum(x*y for x in 1:N for y in x:N if isPalindrome(string(x*y)))
I'm sure there are better ways that others can comment on. Time (after warm-up):
julia> @time fun(1000);
0.082785 seconds (2.03 M allocations: 108.109 MB, 27.35% gc time)

Repeated application of functions

Reading this question got me thinking: For a given function f, how can we know that a loop of this form:
while (x > 2)
x = f(x)
will stop for any value x? Is there some simple criterion?
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
Specifically, can we prove this for sqrt and for log?
For these functions, a proof that ceil(f(x))<x for x > 2 would suffice. You could do one iteration -- to arrive at an integer number, and then proceed by simple induction.
For the general case, probably the best idea is to use well-founded induction to prove this property. However, as Moron pointed out in the comments, this could be impossible in the general case and the right ordering is, in many cases, quite hard to find.
Edit, in reply to Amnon's comment:
If you wanted to use well-founded induction, you would have to define another strict order, that would be well-founded. In case of the functions you mentioned this is not hard: you can take x << y if and only if ceil(x) < ceil(y), where << is a symbol for this new order. This order is of course well-founded on numbers greater than 2, and both sqrt and log are decreasing with respect to it -- so you can apply well-founded induction.
Of course, in general case such an order is much more difficult to find. This is also related, in some way, to total correctness assertions in Hoare logic, where you need to guarantee similar obligations on each loop construct.
There's a general theorem for when the sequence of iterations will converge. (A convergent sequence may not stop in a finite number of steps, but it is getting closer to a target. You can get as close to the target as you like by going far enough out in the sequence.)
The sequence x, f(x), f(f(x)), ... will converge if f is a contraction mapping. That is, there exists a positive constant k < 1 such that for all x and y, |f(x) - f(y)| <= k |x-y|.
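As a small illustration of the contraction-mapping idea, here is a sketch of fixed-point iteration in Python. The choice of cos as the example is mine: it maps [0, 1] into itself and its derivative is bounded by sin(1) < 1 there, so it is a contraction on that interval. The helper name, tolerance and iteration cap are arbitrary assumptions:

```python
import math

def iterate_to_fixed_point(f, x, tol=1e-12, max_iter=1000):
    """Apply f repeatedly until successive values differ by less than tol."""
    for _ in range(max_iter):
        nxt = f(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_iter iterations")

# cos is a contraction on [0, 1], so the iterates settle at the point where cos(x) = x.
root = iterate_to_fixed_point(math.cos, 1.0)
print(root, math.cos(root))
```

Note that convergence here means getting arbitrarily close to the fixed point, which is a different question from whether a loop with a fixed threshold such as x > 2 ever exits.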
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
If we're talking about floats here, that's not true. If for all x > n f(x) is strictly less than x, it will reach n at some point (because there's only a limited number of floating point values between any two numbers).
Of course this means you need to prove that f(x) is actually less than x using floating point arithmetic (i.e. proving it is less than x mathematically does not suffice, because then f(x) = x may still be true with floats when the difference is not enough).
There is no general algorithm to determine whether a function f and a variable x will end or not in that loop. The Halting problem is reducible to that problem.
For sqrt and log, we can safely say the loop ends because we happen to know the mathematical properties of those functions: repeated sqrt approaches 1, and repeated log eventually goes negative. So the loop condition x > 2 has to become false at some point.
Hope that helps.
In the general case, all that can be said is that the loop will terminate when it encounters x_i ≤ 2. That doesn't mean that the sequence will converge, nor does it even mean that it is bounded below 2. It only means that the sequence contains a value that is not greater than 2.
That said, any sequence containing a subsequence that converges to a value strictly less than two will (eventually) halt. That is the case for the sequence x_{i+1} = sqrt(x_i), since x converges to 1. In the case of y_{i+1} = log(y_i), the sequence will contain a value less than 2 before becoming undefined for elements of R (though it is well defined on the extended complex plane, C*, I don't think it will, in general, converge except at any stable points that may exist, i.e. where z = log(z)). Ultimately what this means is that you need to perform some upfront analysis on the sequence to better understand its behavior.
The standard test for convergence of a sequence x_i to a point z is that given ε > 0, there is an n such that for all i > n, |x_i - z| < ε.
As an aside, consider the Mandelbrot Set, M. The test for whether a particular point c in C is an element of M is whether the sequence z_{i+1} = z_i^2 + c stays bounded; it is unbounded whenever there is a |z_i| > 2. For some elements of M the sequence converges (such as 0), but for many it does not (such as -1).
Sure. For all positive numbers x, the following inequality holds:
log(x) <= x - 1
(this is a pretty basic result from real analysis; it suffices to observe that the second derivative of log is always negative for all positive x, so the function is concave down, and that x-1 is tangent to the function at x = 1). From this it follows essentially immediately that your while loop must terminate within the first ceil(x) - 2 steps -- though in actuality it terminates much, much faster than that.
A similar argument will establish your result for f(x) = sqrt(x); specifically, you can use the fact that:
sqrt(x) <= x/(2 sqrt(2)) + 1/sqrt(2)
for all positive x.
If you're asking whether this result holds for actual programs, instead of mathematically, the answer is a little bit more nuanced, but not much. Basically, many languages don't actually have hard accuracy requirements for the log function, so if your particular language implementation had an absolutely terrible math library this property might fail to hold. That said, it would need to be a really, really terrible library; this property will hold for any reasonable implementation of log.
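For what it's worth, you can watch this happen with ordinary floating-point math. The sketch below uses Python's math.sqrt and math.log and counts how many iterations the loop from the question takes; the function name and the safety cap max_steps are my own additions:

```python
import math

def steps_until_at_most_two(f, x, max_steps=10_000):
    """Run 'while x > 2: x = f(x)' and return how many iterations it took."""
    steps = 0
    while x > 2:
        x = f(x)
        steps += 1
        if steps > max_steps:
            raise RuntimeError("loop did not terminate within max_steps")
    return steps

print(steps_until_at_most_two(math.sqrt, 1e6))  # 5
print(steps_until_at_most_two(math.log, 1e6))   # 3
```

Starting from a million, log needs 3 steps and sqrt needs 5, both far below the ceil(x) - 2 bound.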
I suggest reading this wikipedia entry which provides useful pointers. Without additional knowledge about f, nothing can be said.