Prove that L = {a^n b^m | n>=m} is not a regular language

I am stuck on finding a suitable string S for the pumping lemma. Is there any idea how to prove that
L = {a^n b^m | n>=m} is not a regular language?

The pumping lemma states this:
If L is a regular language, then there exists a natural number p such that any string w in L of length at least p can be written as w = uvx where |uv| <= p, |v| > 0 and for all natural numbers n >= 0, u(v^n)x is also in the language.
To prove a language is not regular using the pumping lemma, we need to design a string w such that the rest of the statement fails: that is, there are no valid assignments of u, v and x.
Our language L requires the number of a's to be the same as the number of b's. The shortest string that satisfies the hypothesis that the string w has length at least p is a^(p/2) b^(p/2). We could guess this as our string. If we do, we have a few cases:
v is entirely made of a's. But then, pumping is going to result in a different number of a's and b's, so the resulting string is not in the language; a contradiction.
v spans a's and b's. But then, pumping is going to cause a's and b's to be mixed up in the middle, whereas our language requires all the a's to come first. This is also a contradiction.
v is entirely made of b's. But then, we have the same contradiction as in case #1.
In all cases, this choice of w led to a contradiction. That means the guess worked.
There was a simpler choice for w here: choose w = a^p b^p, then there is only one case. But our choice worked out fine. If our choice had not worked out, we could have learned from that choice what went wrong and chosen a different candidate.

For the previous comment, (1) doesn't make sense, since we can have more a's than b's (n>=m). I probably bombed a midterm yesterday due to this question, but found that the answer is actually in the pumping part.
The solution is that we can pump down as well as up. The pumping lemma for regular languages says that for w = xyz and all i>=0, x(y^i)z is in L.
CASE 1: y = only a's
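Take w = a^p b^p. If y is some number l >= 1 of a's, then we see:
x = a^(p-l)
y = a^l
z = b^p
Now if we use y^0, the pumped word xz = a^(p-l) b^p has fewer a's than b's, so it is not in L.
Because |xy| <= p, y always lies inside the a's of this w, so Case 1 is the only case that can actually occur. The next two cases should be easy to prove but I'll add them regardless.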
CASE 2: y = only b's
x = a^p
y = b^l
z = b^(p-l)
Pumping to xy^2z leaves more b's than a's so that is not an accepted word in L.
CASE 3: y = a's and b's
x = a^(p-l)
y = (a^l)(b^k)
z = b^(p-k)
Pumping x(y^2)z gives a^(p-l) [(a^l)(b^k)(a^l)(b^k)] b^(p-k) which is not included in L.
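The pump-down argument above can be sanity-checked by brute force for small p. This is only a sketch: the helper names in_L and pump_down_always_fails are made up for illustration, and testing a few small values of p is a check on the reasoning, not a proof.

```python
import re

def in_L(s):
    """True if s has the form a^n b^m with n >= m."""
    match = re.fullmatch(r"(a*)(b*)", s)
    return match is not None and len(match.group(1)) >= len(match.group(2))

def pump_down_always_fails(p):
    """Try every split w = xyz of w = a^p b^p with |xy| <= p and |y| >= 1.

    Returns True if removing y (the i = 0 case) leaves L for every split.
    """
    w = "a" * p + "b" * p
    for xy_len in range(1, p + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            z = w[xy_len:]
            if in_L(x + z):  # x y^0 z is still in L, so this split survives
                return False
    return True

if __name__ == "__main__":
    for p in range(1, 8):
        print(p, pump_down_always_fails(p))
```

Every allowed split puts y inside the leading a's, so deleting y always leaves fewer a's than b's.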


Pollard’s p−1 algorithm: understanding of Berkeley paper

This paper explains Pollard's p-1 factorization algorithm. I am having trouble understanding the case where the factor found is equal to the input and we go back and change 'a' (basically page 2, point 2 in the aforementioned paper).
Why do we go back and increment 'a'?
Why do we not go ahead and keep incrementing the factorial? Is it because we keep going into the same cycle we have already seen?
Can I get all the factors using this same algorithm? For example, 49000 = 2^3 * 5^3 * 7^2. Currently I only get 7 and 7000. Perhaps I can use this get_factor() function recursively, but I am wondering about the base cases.
def gcd(a, b):
    # Euclid's algorithm
    if not b:
        return a
    return gcd(b, a % b)

def get_factor(n):
    a = 2
    for factorial in range(2, n - 1):
        # We never compute the factorial itself: only the gcd with n
        # matters, so we work mod n and reuse the previously computed power.
        a = pow(a, factorial, n)
        factor = gcd(a - 1, n)
        if factor == 1:
            continue
        elif factor == n:
            a += 1
        elif factor > 1:
            return factor

n = 10001077
p = get_factor(n)
q = n // p  # integer division, so q is printed as an int
print("factors of", n, "are", p, "and", q)
The linked paper is not a particularly good description of Pollard's p − 1 algorithm; most descriptions discuss smoothness bounds that make the algorithm much more practical. You might like to read this page at Prime Wiki. To answer your specific questions:
Why increment a? Because the original a doesn't work. In practice, most implementations don't bother; a different factoring method, such as the elliptic curve method, is tried instead.
Why not increment the factorial? This is where the smoothness bound comes into play. Read the page at Mersenne Wiki for more details.
Can I get all factors? This question doesn't apply to the paper you linked, which assumes that the number being factored is a semi-prime with exactly two factors. The more general answer is "maybe." This is what happens at Step 3a of the linked paper, and choosing a new a may work (or may not). Or you may want to move to a different factoring algorithm.
Here is my simple version of the p − 1 algorithm, using x instead of a. The while loop computes the magical L of the linked paper (it's the least common multiple of the integers less than the smoothness bound b), which is the same calculation as the factorial of the linked paper, but done in a different way.
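# primegen, ilog and gcd are helpers that are not shown here (see the
# ideone link below); from the description above, p**ilog(p,b) is presumably
# the largest power of p within the smoothness bound b.
def pminus1(n, b, x=2):
    q = 0; pgen = primegen(); p = next(pgen)
    while p < b:
        x = pow(x, p**ilog(p,b), n)
        q, p = p, next(pgen)
    g = gcd(x-1, n)
    if 1 < g < n: return g
    return False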
You can see it in action at http://ideone.com/eMPHtQ, where it factors 10001 as in the linked paper as well as finding a rather spectacular 36-digit factor of fibonacci(522). Once you master that algorithm, you might like to move on to the two-stage version of the algorithm.
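For readers who cannot follow the link, here is a self-contained sketch of the same single-stage idea. It is not the code from the ideone page: primes_below and pminus1_sketch are names invented for this example, and a plain sieve stands in for the primegen and ilog helpers.

```python
from math import gcd

def primes_below(limit):
    """Return all primes < limit using a simple sieve of Eratosthenes."""
    sieve = [True] * max(limit, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    return [i for i in range(limit) if sieve[i]]

def pminus1_sketch(n, bound, x=2):
    """Single-stage p - 1: raise x to every prime power <= bound, mod n."""
    for p in primes_below(bound):
        prime_power = p
        while prime_power * p <= bound:  # largest power of p within the bound
            prime_power *= p
        x = pow(x, prime_power, n)
    g = gcd(x - 1, n)
    return g if 1 < g < n else None  # None: try another bound or base

if __name__ == "__main__":
    print(pminus1_sketch(10001, 10))
```

With bound 10 the accumulated exponent is 8 * 9 * 5 * 7 = 2520, which is a multiple of 72 = 73 - 1 but not of 136 = 137 - 1, so the factor 73 of 10001 = 73 * 137 is found.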

Pumping Lemma for Regular Languages

I'm having some trouble with a rather difficult question. I'm being asked to prove the language {0^n 1^m 0^n | m,n >= 0} is irregular using the pumping lemma. In all the examples I've seen, the language is only being raised to the same variable (i.e. a^n b^n). So my question is, how do I pick a suitable string to test if this language is irregular?
As a follow-up: once I have my string, how do I decompose it into the form xyz where |xy| <= pumping length and |y| >= 1?
In the examples you have seen before there were different letters: n a's followed by n b's. In the given example, there are n 0s at the beginning and at the end of the word. The language adds zero or more 1s between those blocks of 0s.
The word w in the pumping lemma is decomposed as w = x y z with |xy| <= m and |y| > 0, where m is the pumping length. The way to pick a w is the same as before: you pick it such that xy is completely inside a block consisting of one letter. For a^n b^n, a word in L was selected such that xy would consist entirely of a's, so that if it is 'pumped' there will be more a's than b's. So you need at least m a's, and for the word to be in the language that means you need to pick m b's. The shortest such word is w = a^m b^m.
For the new troublesome language, pick a word in this L such that xy consists entirely of 0s (in the first block), so that if it is 'pumped' there will be more 0s in the first block than in the last block, while the number of 1s in the middle is unchanged. For example, w = 0^m 1 0^m works. However, you need to include at least one 1 in your original word; otherwise there is only one block of 0s, and some pumped words are in fact in the language, which means there is no contradiction and thus no proof that L is not regular.
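The advice about including at least one 1 can be checked by brute force for a small pumping length. The sketch below uses invented helper names (in_L, some_split_survives) and only tests pump counts up to a small max_i, so it illustrates the argument and does not prove it.

```python
import re

def in_L(s):
    """True if s has the form 0^n 1^m 0^n with m, n >= 0."""
    if "1" not in s:
        # m = 0: the word is 0^(2n), an even-length run of 0s
        return set(s) <= {"0"} and len(s) % 2 == 0
    match = re.fullmatch(r"(0*)1+(0*)", s)
    return match is not None and len(match.group(1)) == len(match.group(2))

def some_split_survives(w, m, max_i=4):
    """True if some split w = xyz (|xy| <= m, |y| >= 1) keeps
    x y^i z in L for every i in 0..max_i."""
    for xy_len in range(1, m + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            y = w[xy_len - y_len:xy_len]
            z = w[xy_len:]
            if all(in_L(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

if __name__ == "__main__":
    m = 4
    print(some_split_survives("0" * m + "1" + "0" * m, m))  # good witness
    print(some_split_survives("0" * (2 * m), m))            # bad witness
```

For w = 0^m 1 0^m no split survives, which is the contradiction needed. For w = 0^(2m) the split with y = 00 survives every pump, so that word proves nothing.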

Checking bitmask: x & b != 0 VS x & b == b

Suppose x is a bitmask value, and b is one flag, e.g.
x = 0b10101101
b = 0b00000100
There seem to be two ways to check whether the bit indicated by b is turned on in x:
if ((x & b) != 0) // (1)
if ((x & b) == b) // (2)
(The inner parentheses matter in C-like languages, where != and == bind more tightly than &.)
In most circumstances it seems these two checks always yield the same result, given that b is always a binary value with only one bit turned on.
However, I wonder whether there is any exception that makes one method better than the other?
In general, if we interpret both values as bit sets, the first condition checks if the intersection of x and b is not empty (or, to put it differently: if b and x have elements in common), while the second one checks if b is a subset of x.
Clearly, if b is a singleton, b is a subset of x if and only if the intersection is not empty.
So, whenever you cannot guarantee with 100% certainty that b is a singleton, choose your condition wisely. Ask yourself if you want to express that all elements of b must also be elements of x, or that there are elements of b that are also elements of x. It's a huge difference except for the single bit case.
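A small sketch of the difference, using the x from the question plus a hypothetical two-bit mask (multi) that is not in the original post. The function names any_set and all_set are also just illustrative.

```python
def any_set(x, b):
    """Check (1): the intersection of x and b is non-empty."""
    return (x & b) != 0

def all_set(x, b):
    """Check (2): b is a subset of x."""
    return (x & b) == b

x = 0b10101101
single = 0b00000100  # one flag, as in the question
multi = 0b00000110   # two flags; only bit 2 is set in x (assumed example)

if __name__ == "__main__":
    print(any_set(x, single), all_set(x, single))  # both checks agree
    print(any_set(x, multi), all_set(x, multi))    # the checks disagree
    print(any_set(x, 0), all_set(x, 0))            # empty mask: they disagree too
```

The empty mask is the other edge case: check (1) is false for b = 0, while check (2) is true, because the empty set is a subset of everything.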

Repeated application of functions

Reading this question got me thinking: For a given function f, how can we know that a loop of this form:
while (x > 2)
x = f(x)
will stop for any value x? Is there some simple criterion?
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
Specifically, can we prove this for sqrt and for log?
For these functions, a proof that ceil(f(x))<x for x > 2 would suffice. You could do one iteration -- to arrive at an integer number, and then proceed by simple induction.
For the general case, probably the best idea is to use well-founded induction to prove this property. However, as Moron pointed out in the comments, this could be impossible in the general case and the right ordering is, in many cases, quite hard to find.
Edit, in reply to Amnon's comment:
If you wanted to use well-founded induction, you would have to define another strict order that is well-founded. In the case of the functions you mentioned this is not hard: you can take x << y if and only if ceil(x) < ceil(y), where << is a symbol for this new order. This order is of course well-founded on numbers greater than 2, and both sqrt and log are decreasing with respect to it, so you can apply well-founded induction.
Of course, in general case such an order is much more difficult to find. This is also related, in some way, to total correctness assertions in Hoare logic, where you need to guarantee similar obligations on each loop construct.
There's a general theorem for when the sequence of iterations will converge. (A convergent sequence may not stop in a finite number of steps, but it is getting closer to a target. You can get as close to the target as you like by going far enough out in the sequence.)
The sequence x, f(x), f(f(x)), ... will converge if f is a contraction mapping. That is, there exists a positive constant k < 1 such that for all x and y, |f(x) - f(y)| <= k |x-y|.
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
If we're talking about floats here, that's not true. If for all x > n f(x) is strictly less than x, it will reach n at some point (because there's only a limited number of floating point values between any two numbers).
Of course this means you need to prove that f(x) is actually less than x using floating point arithmetic (i.e. proving it is less than x mathematically does not suffice, because then f(x) = x may still be true with floats when the difference is not enough).
There is no general algorithm to determine whether that loop terminates for a given function f and starting value x. The Halting problem is reducible to that problem.
For sqrt and log, we can safely say it does because we happen to know the mathematical properties of those functions. Say, sqrt approaches 1, log eventually goes negative. So the condition x > 2 has to be false at some point.
Hope that helps.
In the general case, all that can be said is that the loop will terminate when it encounters x_i <= 2. That doesn't mean that the sequence will converge, nor does it even mean that it is bounded below 2. It only means that the sequence contains a value that is not greater than 2.
That said, any sequence containing a subsequence that converges to a value strictly less than two will (eventually) halt. That is the case for the sequence x_(i+1) = sqrt(x_i), since x converges to 1. In the case of y_(i+1) = log(y_i), it will contain a value less than 2 before becoming undefined for elements of R (though it is well defined on the extended complex plane, C*, I don't think it will, in general, converge except at any stable points that may exist, i.e. where z = log(z)). Ultimately what this means is that you need to perform some upfront analysis on the sequence to better understand its behavior.
The standard test for convergence of a sequence x_i to a point z is that given any ε > 0, there is an n such that for all i > n, |x_i - z| < ε.
As an aside, consider the Mandelbrot Set, M. The test for whether a particular point c in C is an element of M is whether the sequence z_(i+1) = z_i^2 + c stays bounded; it is unbounded whenever there is a |z_i| > 2. For some elements of M the sequence converges (such as 0), but for many it does not (such as -1).
Sure. For all positive numbers x, the following inequality holds:
log(x) <= x - 1
(this is a pretty basic result from real analysis; it suffices to observe that the second derivative of log is always negative for all positive x, so the function is concave down, and that x-1 is tangent to the function at x = 1). From this it follows essentially immediately that your while loop must terminate within the first ceil(x) - 2 steps -- though in actuality it terminates much, much faster than that.
A similar argument will establish your result for f(x) = sqrt(x); specifically, you can use the fact that:
sqrt(x) <= x/(2 sqrt(2)) + 1/sqrt(2)
for all positive x.
If you're asking whether this result holds for actual programs, instead of mathematically, the answer is a little bit more nuanced, but not much. Basically, many languages don't actually have hard accuracy requirements for the log function, so if your particular language implementation had an absolutely terrible math library this property might fail to hold. That said, it would need to be a really, really terrible library; this property will hold for any reasonable implementation of log.
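The step bound derived from log(x) <= x - 1 can be checked numerically. This sketch assumes Python's math.log and math.sqrt as the implementations of f; count_steps is a name made up for the example.

```python
import math

def count_steps(f, x):
    """Run the loop `while (x > 2) x = f(x)` and count its iterations."""
    steps = 0
    while x > 2:
        x = f(x)
        steps += 1
    return steps

if __name__ == "__main__":
    for start in (3.0, 100.0, 1e6, 1e300):
        log_steps = count_steps(math.log, start)
        sqrt_steps = count_steps(math.sqrt, start)
        # The inequality log(x) <= x - 1 promises at most ceil(x) - 2 steps.
        print(start, log_steps, sqrt_steps, log_steps <= math.ceil(start) - 2)
```

In practice the counts are tiny compared with the ceil(x) - 2 bound, which matches the remark that the loop terminates much faster than the bound suggests.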
I suggest reading this wikipedia entry which provides useful pointers. Without additional knowledge about f, nothing can be said.

Normal vector from least squares-derived plane

I have a set of points and I can derive a least squares solution in the form:
z = Ax + By + C
The coefficients I compute are correct, but how would I get the vector normal to the plane from an equation of this form? Simply using the A, B and C coefficients from this equation as a normal vector doesn't seem correct with my test dataset.
Following on from dmckee's answer:
a x b = (a2*b3 − a3*b2, a3*b1 − a1*b3, a1*b2 − a2*b1)
In your case a1=1, a2=0, a3=A, b1=0, b2=1, b3=B
so a x b = (-A, -B, 1)
Form the two vectors
v1 = <1 0 A>
v2 = <0 1 B>
both of which lie in the plane and take the cross-product:
N = v1 x v2 = <-A, -B, +1> (or v2 x v1 = <A, B, -1> )
It works because the cross-product of two vectors is always perpendicular to both of the inputs. So using two (non-colinear) vectors in the plane gives you a normal.
NB: You probably want a normalized normal, of course, but I'll leave that as an exercise.
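A minimal sketch of that exercise in plain Python, assuming the fitted coefficients A and B are already available. The names plane_normal and dot are invented for this example.

```python
import math

def plane_normal(A, B):
    """Unit normal of the plane z = A*x + B*y + C.

    C only shifts the plane vertically, so it does not affect the normal.
    """
    length = math.sqrt(A * A + B * B + 1.0)
    return (-A / length, -B / length, 1.0 / length)

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

if __name__ == "__main__":
    A, B = 2.0, 3.0  # example coefficients, not from the original post
    n = plane_normal(A, B)
    print(n)
    print(dot(n, (1.0, 0.0, A)), dot(n, (0.0, 1.0, B)))  # both should be ~0
```

The two dot products confirm that the result is perpendicular to v1 = <1 0 A> and v2 = <0 1 B>.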
A little extra color on the dmckee answer. I'd comment directly, but I do not have enough SO rep yet. ;-(
The plane z = Ax + By + C only contains the points (1, 0, A) and (0, 1, B) when C=0. So, we would be talking about the plane z = Ax + By. Which is fine, of course, since this second plane is parallel to the original one, the unique vertical translation that contains the origin. The orthogonal vector we wish to compute is invariant under translations like this, so no harm done.
Granted, dmckee's phrasing is that his specified "vectors" lie in the plane, not the points, so he's arguably covered. But it strikes me as helpful to explicitly acknowledge the implied translations.
Boy, it's been a while for me on this stuff, too.
Pedantically yours... ;-)