Context-free language not closed under intersection - intersection

I found this solution on Wikipedia. Shouldn't it say j > n ≥ 0?
Because the intersection consists of the elements that are common to both languages.
Consider the languages L1 and L2 defined by L1 = {a^(n)b^(n)c^(j) | n,j ≥ 0} and L2 = {a^(j)b^(n)c^(n) | n,j ≥ 0}. They are both context-free.
However, their intersection is the language L = {a^(n)b^(n)c^(n) | n ≥ 0}.

No it should not. There is no relation between j and n. In L1 the only condition is equal number of a's and b's. Whether the number of c's is more or less is immaterial. Similarly in L2, it is equal number of b's and c's. Whether the number of a's is more or less (than the number of b's and c's) is not important. However the intersection will have those strings which fall in both L1 and L2, i.e. (equal number of a's and b's) AND (equal number of b's and c's) which implies equal number of a's, b's and c's.
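To make the argument concrete, here is a minimal sketch of membership tests for the two languages. The function names in_L1, in_L2 and in_both are made up for this illustration; the point is only that a string passes both tests exactly when all three counts agree.

```python
import re

# Hypothetical helper names, used only for illustration.
BLOCKS = re.compile(r"(a*)(b*)(c*)")

def in_L1(s):
    # L1: a^n b^n c^j, equal numbers of a's and b's, any number of c's
    m = BLOCKS.fullmatch(s)
    return m is not None and len(m.group(1)) == len(m.group(2))

def in_L2(s):
    # L2: a^j b^n c^n, equal numbers of b's and c's, any number of a's
    m = BLOCKS.fullmatch(s)
    return m is not None and len(m.group(2)) == len(m.group(3))

def in_both(s):
    # Intersection: both conditions at once, so #a == #b == #c
    return in_L1(s) and in_L2(s)
```

For example, "aabbc" is in L1 but not in L2, while "aabbcc" is in both.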

Explanation on determining if a decimal number has a finite representation in a base

When trying to find the answer I came across this and was wondering if this is true and why it is.
https://stackoverflow.com/a/489870/5712298
If anyone can explain it to me or link me to a page explaining it that would be great.
Stackoverflow markup does not support mathematical notation well, and most readers of this will be programmers, so I am going to use common programming expression syntax:
* multiplication
^ exponentiation
/ division
x[i] Element i of an array x
== equality
PROD product
This deals with the question of whether, given a radix r terminating fraction a/(r^n), there is a terminating radix s fraction b/(s^m) with exactly the same value, a, b integers, r and s positive integers, n and m non-negative integers.
a/(r^n)==b/(s^m) is equivalent to b==a*(s^m)/(r^n). a/(r^n) is exactly equal to some radix s terminating fraction if, and only if, there exists a non-negative integer m such that a*(s^m)/(r^n) is an integer.
Consider the prime factorization of r, PROD(p[i]^k[i]). If, for some i, p[i]^k[i] is a term in the prime factorization of r, then p[i]^(n*k[i]) is a term in the prime factorization of r^n.
a*(s^m)/(r^n) is an integer if, and only if, every p[i]^(n*k[i]) in the prime factorization of r^n is also a factor of a*(s^m)
First suppose p[i] is also a factor of s. Then for sufficiently large m, p[i]^(n*k[i]) is a factor of s^m.
Now suppose p[i] is not a factor of s. p[i]^(n*k[i]) is a factor of a*(s^m) if, and only if, it is a factor of a.
The necessary and sufficient condition for the existence of a non-negative integer m such that b==a*(s^m)/(r^n) is an integer is that, for each p[i]^k[i] in the prime factorization of r, either p[i] is a factor of s or p[i]^(n*k[i]) is a factor of a.
Applying this to the common case of r=10 and s=2, the prime factorization of r is (2^1)*(5^1). 2 is a factor of 2, so we can ignore it. 5 is not, so we need 5^n to be a factor of a.
Consider some specific cases:
Decimal 0.1 is 1/10. 5 is not a factor of 1, so there is no exact binary fraction equivalent.
Decimal 0.625, 625/(10^3). 5^3 is 125, which is a factor of 625, so there is an exact binary fraction equivalent. (It is binary 0.101).
The method in the referenced answer https://stackoverflow.com/a/489870/5712298 is equivalent to this for decimal to binary. It would need some work to extend to the general case, to allow for prime factors whose exponent is not 1.
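The condition derived above can be sketched in a few lines of Python. This is only an illustration under the answer's assumptions (a a non-negative integer, r and s positive integers, n a non-negative integer); the names prime_factors and terminates_in_radix are invented for the example.

```python
def prime_factors(x):
    """Return {prime: exponent} for a positive integer x (trial division)."""
    factors = {}
    p = 2
    while p * p <= x:
        while x % p == 0:
            factors[p] = factors.get(p, 0) + 1
            x //= p
        p += 1
    if x > 1:
        factors[x] = factors.get(x, 0) + 1
    return factors

def terminates_in_radix(a, r, n, s):
    """True if a/(r^n) equals some terminating radix-s fraction."""
    for p, k in prime_factors(r).items():
        if s % p == 0:
            continue  # p divides s, so a large enough m supplies p^(n*k)
        if a % (p ** (n * k)) != 0:
            return False  # p^(n*k) would have to come from a, and it does not
    return True
```

With these definitions, terminates_in_radix(1, 10, 1, 2) is False (decimal 0.1) and terminates_in_radix(625, 10, 3, 2) is True (decimal 0.625).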

How to map number in a range to another in the same range with no collisions?

Effectively what I'm looking for is a function f(x) that outputs into a range that is pre-defined. Calling f(f(x)) should be valid as well. The function should be cyclical, so calling f(f(...(x))) where the number of calls is equal to the size of the range should give you the original number, and f(x) should not be time dependent and will always give the same output.
While I can see that taking a list of all possible values and shuffling it would give me something close to what I want, I'd much prefer it if I could simply plug values into the function one at a time so that I do not have to compute the entire range all at once.
I've looked into Minimal Perfect Hash Functions but haven't been able to find one that doesn't use external libraries. I'm okay with using them, but would prefer to not do so.
If an actual range is necessary to help answer my question, I don't think it would need to be bigger than [0, 2^24-1], but the starting and ending values don't matter too much.
You might want to take a look at the Linear Congruential Generator (LCG). You want a full-period generator (say, m = 2^24), which means the parameters must satisfy the Hull-Dobell theorem.
"Calling f(f(x)) should be valid as well."
This should work.
"the number of calls is equal to the size of the range should give you the original number"
Yes. For an LCG with parameters satisfying the Hull-Dobell theorem, the full period is covered exactly once, and after m calls you are back where you started.
The period of such an LCG is exactly equal to m.
"should not be time dependent and will always give the same output"
An LCG is an O(1) algorithm and it is 100% reproducible.
An LCG is reversible as well, via the extended Euclidean algorithm; see "Reversible pseudo-random sequence generator" for details.
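A minimal sketch of such a generator follows. The multiplier and increment are illustrative choices, not values from the answer; any a with a mod 4 == 1 and any odd c satisfy the Hull-Dobell conditions when m is a power of two. The inverse uses Python's built-in pow(a, -1, m) (Python 3.8+) to obtain the modular inverse instead of a hand-written extended Euclid.

```python
M = 1 << 24          # size of the range [0, 2^24 - 1]
A = 1664525          # illustrative multiplier, A % 4 == 1
C = 1013904223       # illustrative increment, odd

def lcg(x, a=A, c=C, m=M):
    """One step of the generator: maps [0, m) onto itself bijectively."""
    return (a * x + c) % m

def lcg_inverse(y, a=A, c=C, m=M):
    """Undo one step, using the modular inverse of a."""
    a_inv = pow(a, -1, m)
    return (a_inv * (y - c)) % m

def cycle_length(start, a, c, m):
    """Count how many calls it takes to return to start."""
    x = lcg(start, a, c, m)
    steps = 1
    while x != start:
        x = lcg(x, a, c, m)
        steps += 1
    return steps
```

For a quick check on a small modulus, cycle_length(0, 5, 3, 1 << 10) walks the whole range and returns 1024.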
Minimal perfect hash functions are overkill. All you've asked for is a function f that is
bijective, and
"cyclical" (i.e. f^N(x) = x for every x, where f^N means f applied N times)
For a permutation to be cyclical in that way, its order must divide N (or be N but in a way that's just a special case of dividing N). Which in turn means the LCM of the orders of the sub-cycles must divide N. One way to do that is to just have one "sub"-cycle of order N. For power of two N, it's also really easy to have lots of small cycles of some other power-of-two order. General permutations do not necessarily satisfy the cycle-requirement, of course they are bijective but the LCM of the orders of the sub-cycles may exceed N.
In the following I will leave all reduction modulo N implicit. Without loss of generality I will assume the range starts at 0 and goes up to N-1, where N is the size of the range.
The only thing I can immediately think of for general N is f(x) = x + c where gcd(c, N) == 1. The GCD condition ensures there is only one cycle, which necessarily has order N.
For power-of-two N I have more inspiration:
f(x) = c*x where c is odd. Bijective because gcd(c, N) == 1 so c has a modular multiplicative inverse. Also c^N == 1, because φ(N) == N/2 (since N is a power of two), so c^φ(N) == 1 (Euler's theorem) and therefore c^N == (c^(N/2))^2 == 1.
f(x) = x XOR c where c < N. Trivially bijective and trivially cycles with a period of 2, which divides N.
f(x) = clmul(x, c) where c is odd and clmul is carry-less multiplication. Bijective because any odd c has a carry-less multiplicative inverse. Has some power-of-two cycle length (less than N) so it divides N. I don't know why though. This is a weird one, but it has decent special cases such as x ^ (x << k). By symmetry, the "mirrored" version also works, e.g. x ^ (x >> k).
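f(x) = x >>> k where >>> is bit-rotation over the w bits of the value (N == 2^w). Obviously bijective, and f^j(x) = x >>> (j*k), which is back at the unrotated position whenever j*k mod w == 0. The cycle length is therefore w/gcd(w, k), and it divides N only when it is a power of two, for example when w itself is a power of two or when k absorbs the odd part of w (w = 24, k = 3 gives cycle length 8).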
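The candidates above can be checked by brute force on a small range. The sketch below assumes N = 2^8 and arbitrary constants (5, 0x5A, shift and rotation amount 3); the function names are invented for the example. For the rotation, the cycle length is W / gcd(W, k), which is a power of two here because W = 8.

```python
W = 8                 # bit width, chosen small so the check is fast
N = 1 << W            # size of the range
MASK = N - 1

def add_c(x, c=5):            # x + c with gcd(c, N) == 1
    return (x + c) & MASK

def mul_c(x, c=5):            # c*x with c odd
    return (x * c) & MASK

def xor_c(x, c=0x5A):         # x XOR c with c < N
    return x ^ c

def clmul_special(x, k=3):    # special case of carry-less multiplication
    return (x ^ (x << k)) & MASK

def rotate_right(x, k=3):     # bit rotation within W bits
    return ((x >> k) | (x << (W - k))) & MASK

def iterate(f, x, times):
    for _ in range(times):
        x = f(x)
    return x

def returns_after_n_calls(f):
    """True if applying f exactly N times maps every x back to itself."""
    return all(iterate(f, x, N) == x for x in range(N))
```

Calling returns_after_n_calls on each of the five functions confirms the "cyclical" property for this N.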

Pumping Lemma for Regular Languages

I'm having some trouble with a rather difficult question. I'm being asked to prove the language {0^n 1^m 0^n | m,n >= 0} is irregular using the pumping lemma. In all the examples I've seen, the language is only being raised to the same variable (i.e. a^n b^n). So my question is, how do I pick a suitable string to test if this language is irregular?
Also a follow up to that question is once I have my string, how do you decompose the string into the form xyz where |xy| <= pumping length and |y| >=1?
In the examples you have seen before there were different letters: n a's followed by n b's. In the given example, there are n 0s at the beginning and at the end of the word. The language adds zero or more 1s between those blocks of 0s.
The word w in the pumping lemma is decomposed as w = xyz with |xy| <= p and |y| > 0, where p is the pumping length. The way to pick w is the same as before: choose it so that xy lies completely inside a block consisting of one letter. For a^n b^n, a word in L was selected such that xy would consist entirely of a's, so that pumping produces more a's than b's. That requires at least p a's, and for the word to be in the language you then need p b's as well. The shortest such word is w = a^p b^p. For the new, troublesome language, pick a word in L such that xy consists entirely of 0s in the first block, so that pumping puts more 0s in the first block than in the last block while the number of 1s in the middle stays unchanged. For example, w = 0^p 1 0^p. You need to include at least one 1 in your word: otherwise there is only one block of 0s, the pumped words can still be in the language (for instance whenever |y| is even), and so there is no contradiction and no proof that L is irregular.
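The choice of word can be sanity-checked by brute force for one concrete pumping length. The sketch below is an illustration, not a proof: it takes a word, tries every decomposition w = xyz with |xy| <= p and |y| >= 1, and checks that pumping y once more leaves the language. The names in_language and every_split_fails, and the variable p for the pumping length, are assumptions made for this example.

```python
import re

def in_language(s):
    """Membership in {0^n 1^m 0^n | m, n >= 0}."""
    if "1" not in s:
        # m == 0: the word is 0^(2n), an even number of 0s
        return set(s) <= {"0"} and len(s) % 2 == 0
    match = re.fullmatch(r"(0*)1+(0*)", s)
    return match is not None and len(match.group(1)) == len(match.group(2))

def every_split_fails(w, p):
    """True if, for every valid split, the pumped word x y y z is not in L."""
    for xy_len in range(1, p + 1):
        for y_len in range(1, xy_len + 1):
            x = w[: xy_len - y_len]
            y = w[xy_len - y_len : xy_len]
            z = w[xy_len:]
            if in_language(x + y * 2 + z):
                return False  # this split survives pumping, no contradiction
    return True
```

With p = 5, the word "0" * 5 + "1" + "0" * 5 makes every_split_fails return True, while the all-zero word "0" * 10 returns False because a y of even length keeps the word in the language.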

Three-way xor-like function

I'm trying to solve the following puzzle:
Given a stream of numbers (only 1 iteration over them is allowed) in which all numbers appear 3 times, but 1 number appears only 2 times, find this number, using O(1) memory.
I started with the idea that, if all numbers appeared 2 times, and 1 number only once, I could use xor operation between all numbers and the result would be the incognito number.
So I want to extend this idea to solve the puzzle. All I need is a xor-like function (or operator), which would yield 0 on the third apply:
SEED xor3 X xor3 X xor3 X = SEED
X xor3 Y xor3 SEED xor3 X xor3 Y xor3 Y xor3 X = SEED
Any ideas for such a function?
Regard XOR as summation on each bit of a number expressed in binary (i.e. a radix of 2), modulo 2.
Now consider a numeral system whose digits (trits) are 0, 1, and 2. That is, it has a radix of 3.
The operator T now becomes an operation on any number, decomposed into this radix. As in XOR, you sum the digits position by position, but the difference is that operator T is run modulo 3.
You can easily show that a T a T a is zero for any a. You can also show that T is both commutative and associative. That is necessary since, in general, your sequence will have the numbers jumbled up.
Now apply this to your list of numbers. At the end of the operation, the output will be b where b = o T o and o is the number that occurs exactly twice.
Your solution for the simpler case (all number appear twice, one number appears once) works since xor operates on each bit x as
x xor x = 0 and 0 xor x = x
xor is basically a bit-wise summation modulo 2. You would need the base-3 equivalent: transform each number into a base-3 representation, and then use summation modulo 3 for each digit:
xor3 | 0 1 2
-----+------
  0  | 0 1 2
  1  | 1 2 0
  2  | 2 0 1
Call this operation xor3. Now you have for each digit x:
x xor3 x xor3 x = 0 and 0 xor3 x = x
If you apply that to all your numbers then all values that appear 3 times will vanish. The result is x xor3 x of the number x that appears twice. To recover x you need to apply digit-wise division by 2 modulo 3. Since 2*2 == 4 == 1 (mod 3), dividing a digit by 2 is the same as doubling it modulo 3.
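A minimal sketch of this idea, assuming non-negative integers; the names xor3 and find_number_appearing_twice are chosen for the example. The final step uses the fact that doubling a base-3 digit twice gives 4d, which is d modulo 3, so applying xor3 of the accumulator with itself recovers the original number.

```python
def xor3(x, y):
    """Digit-wise sum modulo 3 of the base-3 representations of x and y."""
    result = 0
    place = 1
    while x or y:
        digit = (x % 3 + y % 3) % 3
        result += digit * place
        x //= 3
        y //= 3
        place *= 3
    return result

def find_number_appearing_twice(stream):
    """Single pass; only one accumulator is kept."""
    acc = 0
    for value in stream:
        acc = xor3(acc, value)
    # acc is now (v xor3 v) for the value v that appeared twice:
    # every digit of v doubled modulo 3. Doubling once more undoes that.
    return xor3(acc, acc)
```

For instance, find_number_appearing_twice([7, 5, 7, 9, 5, 9, 5, 9]) returns 7.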
I believe there are more efficient ways to implement that. The advantage of the xor function in the first case relies on the fact that xor is a natural base-2 operation. Is there any practical application for that?
This approach is a bit fragile: if the precondition (all numbers appear 3 times except one that appears twice) is violated, the algorithm will not help you.
Take a Map with int keys and int values. Then walk through your numbers and, for each number x, increase the corresponding value by one. If x is a new key, take 0 as the start value.
Then you can analyze it easily: Walk through all keys and check the cardinality. It should be three for all keys except one that should be two. This is more robust and my gut feeling says it is also faster.
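A short sketch of the counting approach, using collections.Counter as the map; the function name find_by_counting is made up for the example. Note that the map holds one entry per distinct number, so its memory use grows with the input rather than staying O(1).

```python
from collections import Counter

def find_by_counting(numbers):
    """Count occurrences, then verify the precondition while searching."""
    counts = Counter(numbers)
    twice = [value for value, count in counts.items() if count == 2]
    thrice = [value for value, count in counts.items() if count == 3]
    # Exactly one value may appear twice; every other value must appear 3 times.
    if len(twice) != 1 or len(twice) + len(thrice) != len(counts):
        raise ValueError("precondition violated")
    return twice[0]
```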

How do you calculate the total number of all possible unique subsets from a set with repeats?

Given a set** S containing duplicate elements, how can one determine the total number of all the possible subsets of S, where each subset is unique?
For example, say S = {A, B, B} and let K be the set of all subsets, then K = {{}, {A}, {B}, {A, B}, {B, B}, {A, B, B}} and therefore |K| = 6.
Another example would be if S = {A, A, B, B}, then K = {{}, {A}, {B}, {A, B}, {A, A}, {B, B}, {A, B, B}, {A, A, B}, {A, A, B, B}} and therefore |K| = 9.
It is easy to see that if S is a real set, having only unique elements, then |K| = 2^|S|.
What is a formula to calculate this value |K| given a "set" S (with duplicates), without generating all the subsets?
** Not technically a set.
Take the product of all the (frequencies + 1).
For example, in {A,B,B}, the answer is (1+1) [the number of As] * (2+1) [the number of Bs] = 6.
In the second example, count(A) = 2 and count(B) = 2. Thus the answer is (2+1) * (2+1) = 9.
The reason this works is that you can define any subset as a vector of counts - for {A,B,B}, the subsets can be described as {A=0,B=0}, {A=0,B=1}, {A=0,B=2}, {A=1,B=0}, {A=1,B=1}, {A=1,B=2}.
For each number in counts[] there are (frequencies of that object + 1) possible values. (0..frequencies)
Therefore, the total number of possibilities is the product of all (frequencies+1).
The "all unique" case can also be explained this way - there is one occurrence of each object, so the answer is (1+1)^|S| = 2^|S|.
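The formula is easy to sketch and to cross-check against brute-force enumeration on small inputs. The function names below are invented for the example, and math.prod requires Python 3.8 or later.

```python
from collections import Counter
from itertools import combinations
from math import prod

def count_unique_subsets(items):
    """Product of (frequency + 1) over the distinct elements."""
    return prod(freq + 1 for freq in Counter(items).values())

def count_by_enumeration(items):
    """Brute-force check: generate every sub-multiset and deduplicate."""
    ordered = sorted(items)  # sorting makes equal sub-multisets equal tuples
    seen = set()
    for size in range(len(ordered) + 1):
        seen.update(combinations(ordered, size))
    return len(seen)
```

For ["A", "B", "B"] both functions return 6, and for ["A", "A", "B", "B"] both return 9.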
I'll argue that this problem is simple to solve when viewed in the proper way. You don't care about the order of the elements, only whether they appear in a subset or not.
Count the number of times each element appears in the set. For the one element set {A}, how many subsets are there? Clearly there are only two sets. Now suppose we added another element, B, that is distinct from A, to form the set {A,B}. We can form the list of all sets very easily. Take all the sets that we formed using only A, and add in zero or one copy of B. In effect, we double the number of sets. Clearly we can use induction to show that for N distinct elements, the total number of sets is just 2^N.
Suppose that some elements appear multiple times? Consider the set with three copies of A. Thus {A,A,A}. How many subsets can you form? Again, this is simple. We can have 0, 1, 2, or 3 copies of A, so the total number of subsets is 4 since order does not matter.
In general, for N copies of the element A, we will end up with N+1 possible subsets. Now, expand this by adding in some number, M, of copies of B. So we have N copies of A and M copies of B. How many total subsets are there? Yes, this seems clear too. To every possible subset with only A in it (there were N+1 of them) we can add between 0 and M copies of B.
So the total number of subsets when we have N copies of A and M copies of B is simple. It must be (N+1)*(M+1). Again, we can use an inductive argument to show that the total number of subsets is the product of such terms. Merely count up the total number of replicates for each distinct element, add 1, and take the product.
See what happens with the set {A,B,B}. We get 2*3 = 6.
For the set {A,A,B,B}, we get 3*3 = 9.