Need to prove language L = {a^n b^m: n < m < 2n} is not regular - proof

I don't understand the pumping lemma very well, and could use a simple break down of how to prove something like this.

Assume that the language L = {a^n b^m: n < m < 2n} is regular. Then, by the pumping lemma, there is a pumping length p such that every string in L of length at least p can be written as xyz where |xy| <= p, |y| > 0 and for every natural number k (including k = 0), x(y^k)z is also a string in L. We may take p >= 2, since any number larger than a pumping length is also a pumping length. Consider the string a^p b^(p+1). This string has length at least p and is in L, because p < p + 1 < 2p. Now we consider options for the substring y:
y consists only of a's. Then, we can choose k > 1 so that the number of a's is at least the number of b's, getting a string not in L.
y consists of both a's and b's. Then, for k > 1, pumping will cause some a's to come after some b's, resulting in a string that can't possibly be in L.
y consists only of b's. Then, we can choose k = p + 1 so there are at least 2p + 1 b's, giving more than twice the number of b's as there are a's, and therefore a string not in L.
Because these three ways are the only ways to choose the substring y (and the condition |xy| <= p in fact only allows the first), there is no way to choose y so that the conditions of the pumping lemma are satisfied. This is a contradiction. Therefore, the assumption that the language is regular must be incorrect. It follows that the language is not regular. The proof was by contradiction / reductio ad absurdum.
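The case analysis can also be checked mechanically for small pumping lengths. The sketch below assumes the intended language is {a^n b^m : n < m < 2n}, which is what the third case of the proof relies on, and uses hypothetical helper names (in_L, every_split_fails). Since |xy| <= p forces y to lie inside the leading a's, a split is described by j = |y| alone.

```c
#include <stdbool.h>

/* Membership test for L = { a^n b^m : n < m < 2n }, given only the counts. */
bool in_L(long n, long m) {
    return n < m && m < 2 * n;
}

/* For w = a^p b^(p+1) and every split with |xy| <= p and |y| = j >= 1,
 * y lies inside the a's, so x y^k z = a^(p + (k-1)j) b^(p+1).
 * Returns true if pumping with k = 2 leaves L for every such j. */
bool every_split_fails(long p) {
    if (!in_L(p, p + 1))
        return false;               /* w itself must be in L (needs p >= 2) */
    for (long j = 1; j <= p; ++j) {
        if (in_L(p + j, p + 1))     /* pumped string with k = 2 */
            return false;
    }
    return true;
}
```

Calling every_split_fails(p) for a few values of p >= 2 is only a sanity check of the argument, not a replacement for the proof, which has to hold for every p.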

complexity class of functions

What would these statements mean if f(n) and g(n) are functions over natural numbers?
g(n) is in Θ(f(n)).
and
An algorithm is in the complexity class Θ(f(n)).
g(n) is bracketed by a*f(n) and b*f(n) (where a < b are two constants) for sufficiently large n.
The running time of the algorithm is of order Θ(f(n)).
If g(n) is in Θ(f(n)) then there exist real constants a and b greater than zero and an integer constant n0 greater than zero such that for all n > n0, a * f(n) <= g(n) <= b * f(n). For instance, the function g(n) = (1 + sin(n)^2) * n^2 is in Θ(n^2) because we can choose a = 1, b = 2 and n0 = 1 and verify the inequalities hold, since sin(n)^2 always lies between 0 and 1.
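The two-sided bound in this definition can be spot-checked numerically. The sketch below uses its own example, chosen only for illustration: f(n) = n^2 and g(n) = 3n^2 + 2n, with the hypothetical helper theta_bounds_hold. A finite check like this cannot prove membership in Θ(f(n)); it can only fail to find a counterexample.

```c
#include <stdbool.h>

/* Example functions (assumptions for illustration only). */
double f(double n) { return n * n; }
double g(double n) { return 3.0 * n * n + 2.0 * n; }

/* Checks a*f(n) <= g(n) <= b*f(n) for every integer n with n0 < n <= limit. */
bool theta_bounds_hold(double a, double b, long n0, long limit) {
    for (long n = n0 + 1; n <= limit; ++n) {
        double lo = a * f((double)n);
        double hi = b * f((double)n);
        if (g((double)n) < lo || g((double)n) > hi)
            return false;
    }
    return true;
}
```

With a = 3, b = 5 and n0 = 1 the check passes, because 2n <= 2n^2 for n >= 1. With a = 4 it fails from n = 3 onwards, which shows that the constants have to be chosen to fit the function.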
If an algorithm is in the complexity class Θ(f(n)), then the relation G which maps inputs of size n to numbers of operations m taken to process that input on some implied computer has the following properties:
Both min({m | G(n, m)}) and max({m | G(n, m)}) are defined.
The function g(n) = min({m | G(n, m)}) is in Θ(f(n)).
The function h(n) = max({m | G(n, m)}) is in Θ(f(n)).
For a formal definition of Big Θ notation have a look at Wikipedia.
You can find a more descriptive explanation here.
For an algorithm, n is the size of the input data (for example the number of elements in a list/array/…) and you will get an estimate of the "cost" of your algorithm for increasing n.
Cost typically means the number of elementary operations (arithmetic operations, comparisons, conditions).
Simple example of an "algorithm": accessing an arbitrary element in a container of size n:
for an array you can access each element directly. So the cost is constant, meaning f(n) = 1 in your notation
for a linked list you have to iterate through the list from the beginning. So on average you have to pass half of the list: f(n) = n/2 (or equivalently f(n) = n).
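The difference between the two containers can be made visible by counting steps. This is a minimal sketch with hypothetical names (struct node, array_get, list_get); the steps output parameter exists only to illustrate the cost and would not appear in real code.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Array access: one indexing operation regardless of the array size. */
int array_get(const int *arr, size_t index, size_t *steps) {
    *steps = 1;
    return arr[index];
}

/* Linked-list access: follow `index` links starting from the head.
 * The caller must ensure the list has more than `index` nodes. */
int list_get(const struct node *head, size_t index, size_t *steps) {
    *steps = 0;
    while (index-- > 0) {
        head = head->next;
        ++*steps;
    }
    return head->value;
}
```

Reaching position k in the list takes k link traversals, so averaging over all positions of a list of size n gives roughly n/2 steps, while the array always takes one.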

Generating reversible permutations over a set

I want to traverse all the elements in the set Q = [0, 2^16) in a non-sequential manner. To do so I need a function f(x): Q --> Q which gives the order in which the set will be sorted. For example:
f(0) = 2345
f(1) = 4364
f(2) = 24
(...)
To recover the order I would need the inverse function f'(x): Q --> Q which would output:
f'(2345) = 0
f'(4364) = 1
f'(24) = 2
(...)
The function must be bijective, for each element of Q the function uniquely maps to another element of Q.
How can I generate such a function, or are there any known functions that do this?
EDIT: In the following answer, f(x) is "what comes after x", not "what goes in position x". For example, if your first number is 5, then f(5) is the next element, not f(1). In retrospect, you probably thought of f(x) as "what goes in position x". The function defined in this answer is much weaker if used as "what goes in position x".
Linear congruential generators fit your needs.
A linear congruential generator is defined by the equation
f(x) = a*x+c (mod m)
for some constants a, c, and m. In this case, m = 65536.
An LCG has full period (the property you want) if the following properties hold:
c and m are relatively prime.
a-1 is divisible by all prime factors of m.
If m is a multiple of 4, a-1 is a multiple of 4.
We'll go with a = 5, c = 1.
To invert an LCG, we solve for x in terms of f(x):
x = (a^-1)*(f(x) - c) (mod m)
We can find the inverse of 5 mod 65536 by the extended Euclidean algorithm, or since we just need this one computation, we can plug it into Wolfram Alpha. The result is 52429.
Thus, we have
f(x) = (5*x + 1) % 65536
f^-1(x) = (52429 * (x - 1)) % 65536
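The two formulas can be written down and checked directly. This is a minimal sketch; the names LCG_M, lcg_next, lcg_prev and lcg_has_full_period are made up for illustration, and the code assumes a 32-bit unsigned int.

```c
#include <stdint.h>
#include <stdbool.h>

#define LCG_M 65536u  /* size of Q = [0, 2^16) */

/* Forward map: f(x) = 5x + 1 (mod 65536). */
uint32_t lcg_next(uint32_t x) {
    return (5u * x + 1u) % LCG_M;
}

/* Inverse map: x = 52429 * (f(x) - 1) (mod 65536).
 * Unsigned wraparound keeps (y - 1) correct for y = 0,
 * because 65536 divides 2^32. */
uint32_t lcg_prev(uint32_t y) {
    return (52429u * (y - 1u)) % LCG_M;
}

/* Walks the cycle starting at 0 and reports whether all 65536 values
 * appear exactly once before the walk returns to 0. */
bool lcg_has_full_period(void) {
    static bool seen[LCG_M];
    for (uint32_t i = 0; i < LCG_M; ++i)
        seen[i] = false;
    uint32_t x = 0;
    for (uint32_t i = 0; i < LCG_M; ++i) {
        if (seen[x])
            return false;
        seen[x] = true;
        x = lcg_next(x);
    }
    return x == 0;
}
```

Using unsigned arithmetic matters here: with a signed int, x - 1 would be negative for x = 0 and the % operator in C would return a negative result.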
There are many approaches to solving this.
Since your set size is small, the requirement for generating the function and its inverse can simply be done via memory lookup. So once you choose your permutation, you can store the forward and reverse directions in lookup tables.
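One approach to creating a permutation is mapping out all elements in an array and then randomly swapping them "enough" times. C code:
#include <stdlib.h>

#define PERM_SIZE 65536   /* size of Q from the question */
#define SEED 12345u       /* example value; the same seed gives the same permutation */

int f[PERM_SIZE], inv_f[PERM_SIZE];

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

void build_permutation(void) {
    int i;
    // start out with identity permutation
    for (i = 0; i < PERM_SIZE; ++i)
        f[i] = i;
    // seed your random number generator
    srand(SEED);
    // loop "enough" times, where we choose "enough" = size of array
    for (i = 0; i < PERM_SIZE; ++i) {
        // note: RAND_MAX may be as small as 32767, which limits the range of j
        int j = rand() % PERM_SIZE;
        swap(&f[i], &f[j]);
    }
    // create inverse of f
    for (i = 0; i < PERM_SIZE; ++i)
        inv_f[f[i]] = i;
}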
Enjoy

Bitwise comparison for 16 bitstrings

I have 16 unrelated binary strings (of the same length), e.g. 100000001010, 010100010010 and so on, and I need to find a bitstring in which position x is a 1 IF position x is 1 for AT LEAST 2 bitstrings out of the 16.
Initially, I tried using bitwise XOR and this works great as long as an even number of strings contain a 1, but when an odd number of strings contain a 1, the answer given is reversed.
A simple example (with 3 strings) would be:
A: 10101010
B: 01010111
C: 11011011
f(A,B,C)= answer
Expected answer: 11011011
Answer I'm getting right now: 11011001
I know I'm wrong somewhere but I'm at a loss on how to proceed
Help much appreciated
You can do something like
unsigned once = x[0], twice = 0;
for (int i = 1; i < 16; ++i) {
twice |= once & x[i];
once |= x[i];
}
(A AND B) OR (A AND C) OR (B AND C)
This is higher complexity than what you had originally.
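The once/twice loop can be wrapped in a function and tried on the three-string example from the question. This is a minimal sketch; the function name at_least_two and the choice of unsigned as the word type are assumptions for illustration.

```c
#include <stddef.h>

/* Bit i of the result is 1 iff bit i is 1 in at least two of the n inputs. */
unsigned at_least_two(const unsigned *x, size_t n) {
    unsigned once = 0, twice = 0;
    for (size_t i = 0; i < n; ++i) {
        twice |= once & x[i];   /* bit was already seen and is set again */
        once  |= x[i];          /* bit has been seen at least once so far */
    }
    return twice;
}
```

For A = 10101010, B = 01010111 and C = 11011011 this returns 11011011, which matches the expected answer and also agrees with (A AND B) OR (A AND C) OR (B AND C). The loop needs one AND and two ORs per input, so it scales to 16 strings without writing out every pair.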

Addition as binary operations

I'm adding a pair of unsigned 32bit binary integers (including overflow). The addition is expressive rather than actually computed, so there's no need for an efficient algorithm, but since each component is manually specified in terms of individual bits, I need one with a compact representation. Any suggestions?
Edit: In terms of boolean operators. So I'm thinking that carry = a & b; sum = a ^ b; for the first bit, but the other 31?
Oh, and subtraction!
You cannot perform addition with a single simple boolean operator; you need an adder. (Of course the adder can be built by combining boolean operators.)
The adder adds two bits plus carry, and passes carry out to next bit.
Pseudocode:
carry = 0
for i = 31 to 0
sum = a[i] + b[i] + carry
result[i] = sum & 1
carry = sum >> 1
next i
Here is an implementation using the macro language of VEDIT text editor.
The two numbers to be added are given as ASCII strings, one on each line.
The results are inserted on the third line.
Reg_Empty(10) // result as ASCII string
#0 = 0 // carry bit
for (#9=31; #9>=0; #9--) {
#1 = CC(#9)-'0' // a bit from first number
#2 = CC(#9+34)-'0' // a bit from second number
#3 = #0+#1+#2 // add with carry
#4 = #3 & 1 // resulting bit
#0 = #3 >> 1 // new carry
Num_Str(#4, 11, LEFT) // convert bit to ASCII
Reg_Set(10, #11, INSERT) // insert bit to start of string
}
Line(2)
Reg_Ins(10) IN
Return
Example input and output:
00010011011111110101000111100001
00110110111010101100101101110111
01001010011010100001110101011000
Edit:
Here is pseudocode where the adder has been implemented with boolean operations:
carry = 0
for i = 31 to 0
sum[i] = a[i] ^ b[i] ^ carry
carry = (a[i] & b[i]) | (a[i] & carry) | (b[i] & carry)
next i
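The boolean pseudocode translates almost line for line into C. The sketch below is one possible rendering, with hypothetical function names add_bits and sub_bits; unlike the pseudocode, it treats bit 0 as the least significant bit, so the loop counts upward. Subtraction is covered by the two's-complement identity a - b = a + ~b + 1, that is, the same adder with b inverted and the initial carry set to 1.

```c
#include <stdint.h>

/* Ripple-carry addition built only from &, |, ^ and shifts.
 * Bit 0 is the least significant bit, so the carry moves upward. */
uint32_t add_bits(uint32_t a, uint32_t b, uint32_t carry) {
    uint32_t result = 0;
    for (int i = 0; i < 32; ++i) {
        uint32_t ai = (a >> i) & 1u;
        uint32_t bi = (b >> i) & 1u;
        result |= (ai ^ bi ^ carry) << i;                  /* sum bit */
        carry = (ai & bi) | (ai & carry) | (bi & carry);   /* carry out */
    }
    return result;   /* the final carry (overflow) is discarded */
}

/* Subtraction as two's-complement addition: a - b = a + ~b + 1. */
uint32_t sub_bits(uint32_t a, uint32_t b) {
    return add_bits(a, ~b, 1u);
}
```

Applied to the example input above (0x137F51E1 and 0x36EACB77 in hexadecimal), add_bits with an initial carry of 0 gives 0x4A6A1D58, the same bit pattern as the third line of the example.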
Perhaps you can begin by stating addition for two 1-bit numbers, with overflow (=carry):
A | B | SUM | CARRY
===================
0 | 0 |  0  |   0
0 | 1 |  1  |   0
1 | 0 |  1  |   0
1 | 1 |  0  |   1
To generalize this further, you need a "full adder" which also takes a carry as an input, from the preceding stage. Then you can express the 32-bit addition as a chain of 32 such full adders (with the first stage's carry input tied to 0).
Regarding the data structure used to represent these numbers, there are 4 options:
1) Bit Array
A bit array is an array data structure that compactly stores individual bits.
They are also known as bitmap, bitset or bitstring.
2) Bit Field
A bit field is a common idiom used in computer programming to compactly store multiple logical values as a short series of bits where each of the single bits can be addressed separately.
3) Bit Plane
A bit plane of a digital discrete signal (such as image or sound) is a set of bits corresponding to a given bit position in each of the binary numbers representing the signal.
4) Bit Board
A bitboard or bit field is a format that stuffs a whole group of related boolean variables into the same integer, typically representing positions on a board game.
Regarding implementation, you can check that at each step we have the following:
S = a xor b xor c
S is the sum of the current bits a and b
c is the input carry
Cout, the output carry, is (a & b) xor (c & (a xor b))
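The carry expression here looks different from the majority form (a & b) | (a & c) | (b & c), but the two agree on every input. Since there are only 8 input combinations, this can be checked exhaustively. A small sketch with hypothetical function names:

```c
#include <stdbool.h>

/* Carry-out in the xor form: (a & b) xor (c & (a xor b)). */
unsigned carry_xor_form(unsigned a, unsigned b, unsigned c) {
    return (a & b) ^ (c & (a ^ b));
}

/* Carry-out as a majority vote: at least two of the three inputs are 1. */
unsigned carry_majority_form(unsigned a, unsigned b, unsigned c) {
    return (a & b) | (a & c) | (b & c);
}

/* Compares both carry forms and the sum bit against ordinary
 * arithmetic for all 8 combinations of single-bit inputs. */
bool full_adder_forms_agree(void) {
    for (unsigned bits = 0; bits < 8; ++bits) {
        unsigned a = bits & 1u;
        unsigned b = (bits >> 1) & 1u;
        unsigned c = (bits >> 2) & 1u;
        unsigned total = a + b + c;   /* arithmetic reference, 0..3 */
        if (carry_xor_form(a, b, c) != carry_majority_form(a, b, c))
            return false;
        if (carry_xor_form(a, b, c) != (total >> 1))
            return false;
        if ((a ^ b ^ c) != (total & 1u))
            return false;
    }
    return true;
}
```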

Complexity of a given function

When I analyzed the complexity of the code segment below, I found that it is O(n/2). But while searching the internet I discovered that it is probably O(n). I'd like to know who's correct.
void function(int n) {
int i = 1, k = 100;
while (i < n) {
k++;
i += 2;
}
}
What is the point of the variable k in the above method? Regardless, big-O notation talks about the behavior in the limit (as the value of n approaches infinity). As such, big-O notation is agnostic to BOTH scaling factors and constants. Which is to say, for any constant "c" and positive scaling factor "s"
O(f(n)) is equivalent to O(s*f(n) + c)
In your case f(n) = n, s = 1/2, and c = 0. So...
O(n) = O(n/2)
O(n) is the same as O(n/2)
The idea of big-O notation is to understand how fast an algorithm will run as you give it a larger input. So, for example, if you double the size of your input, will the program take twice as long or will it take 4 times as long.
Both n and n/2 behave identically as you vary the value of n (i.e. if you increase n by a factor of 10, both n itself and n/2 grow by the same factor of 10), so they fall into the same class.
O(n/2) = O(0.5n) = O(n). See Wikipedia for more on this.
If f is O(g), then there exist some c and n such that for all x > n, |f(x)| <= c * |g(x)|. That is, from input n onwards, c * g(x) dominates f(x).
It follows that O(n/2) = O(n), because,
If f(x) = x/2 and g(x) = x, then we set c = 0.5 and n = 0.
If f(x) = x and g(x) = x/2, then we set c = 2 and n = 0.
Note that there are infinitely many values for c and n that you can use to prove this. (In the above I minimized them, but that is not necessary.)
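To see the n/2 concretely, the loop from the question can be instrumented to return how many times its body runs. This is a minimal sketch; count_iterations is a hypothetical name, and the unused variable k has been dropped.

```c
/* Same loop as in the question, but returning the number of iterations. */
long count_iterations(long n) {
    long i = 1, count = 0;
    while (i < n) {
        ++count;
        i += 2;
    }
    return count;
}
```

The count is n/2 rounded down, for example 5 for n = 10 and 500 for n = 1000. Doubling n doubles the count, which is exactly the linear growth that O(n) describes; the factor 1/2 only changes the constant c in the definition.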