Generating reversible permutations over a set - function

I want to traverse all the elements in the set Q = [0, 2^16) in a non-sequential manner. To do so I need a function f(x): Q --> Q that gives the order in which the set will be visited. For example:
f(0) = 2345
f(1) = 4364
f(2) = 24
(...)
To recover the order I would need the inverse function f'(x): Q --> Q, which would output:
f'(2345) = 0
f'(4364) = 1
f'(24) = 2
(...)
The function must be bijective, for each element of Q the function uniquely maps to another element of Q.
How can I generate such a function, or are there any known functions that do this?

EDIT: In the following answer, f(x) is "what comes after x", not "what goes in position x". For example, if your first number is 5, then f(5) is the next element, not f(1). In retrospect, you probably thought of f(x) as "what goes in position x". The function defined in this answer is much weaker if used as "what goes in position x".
Linear congruential generators fit your needs.
A linear congruential generator is defined by the equation
f(x) = a*x+c (mod m)
for some constants a, c, and m. In this case, m = 65536.
An LCG has full period (the property you want) if the following properties hold:
c and m are relatively prime.
a-1 is divisible by all prime factors of m.
If m is a multiple of 4, a-1 is a multiple of 4.
We'll go with a = 5, c = 1.
To invert an LCG, we solve for x in terms of f(x):
x = (a^-1)*(f(x) - c) (mod m)
We can find the inverse of 5 mod 65536 by the extended Euclidean algorithm, or since we just need this one computation, we can plug it into Wolfram Alpha. The result is 52429.
Thus, we have
f(x) = (5*x + 1) % 65536
f^-1(x) = (52429 * (x - 1)) % 65536
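As a quick check of the constants above, here is a minimal Python sketch. The names `f`, `f_inv`, `cycle_length` and the constant names are illustrative, not from the question. It confirms that 52429 inverts 5 modulo 65536, that the two functions undo each other, and that following f from 0 visits every element of Q before returning to 0.

```python
M = 65536          # modulus, |Q| = 2^16
A, C = 5, 1        # multiplier and increment chosen above
A_INV = 52429      # modular inverse of A mod M

def f(x):
    """Element that comes after x in the traversal."""
    return (A * x + C) % M

def f_inv(x):
    """Element that comes before x in the traversal."""
    return (A_INV * (x - C)) % M

def cycle_length(start=0):
    """Follow f from start and count steps until it returns to start."""
    x, steps = f(start), 1
    while x != start:
        x = f(x)
        steps += 1
    return steps
```

Python's % always returns a non-negative result for a positive modulus. In C with signed integers, the x - 1 term would need care when x = 0.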

There are many approaches to solving this.
Since your set size is small, the requirement for generating the function and its inverse can simply be done via memory lookup. So once you choose your permutation, you can store the forward and reverse directions in lookup tables.
One approach to creating a permutation is mapping out all elements in an array and then randomly swapping them "enough" times. C code:
int f[PERM_SIZE], inv_f[PERM_SIZE];
int i;
// start out with identity permutation
for (i=0; i < PERM_SIZE; ++i) {
    f[i] = i;
    inv_f[i] = i;
}
// seed your random number generator
srand(SEED);
// loop "enough" times, where we choose "enough" = size of array
for (i=0; i < PERM_SIZE; ++i) {
    // swap f[i] with a randomly chosen element f[j]
    int j = rand()%PERM_SIZE;
    int tmp = f[i];
    f[i] = f[j];
    f[j] = tmp;
}
// create inverse of f
for (i=0; i < PERM_SIZE; ++i)
    inv_f[f[i]] = i;
Enjoy
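For comparison, the same lookup-table idea can be sketched in Python with a Fisher-Yates shuffle. This is a different technique from the random-swap loop above: given an unbiased random source, it produces every permutation with equal probability. The names `make_permutation`, `perm_size` and `seed` are illustrative.

```python
import random

def make_permutation(perm_size, seed):
    """Return (f, inv_f): a seeded random permutation and its inverse table."""
    rng = random.Random(seed)          # private generator, reproducible per seed
    f = list(range(perm_size))         # start from the identity permutation
    # Fisher-Yates: swap position i with a random position in [0, i]
    for i in range(perm_size - 1, 0, -1):
        j = rng.randint(0, i)          # randint is inclusive on both ends
        f[i], f[j] = f[j], f[i]
    inv_f = [0] * perm_size
    for i, v in enumerate(f):
        inv_f[v] = i                   # f maps i -> v, so inv_f maps v -> i
    return f, inv_f
```

Calling `make_permutation(65536, seed)` with the same seed rebuilds the same pair of tables, so only the seed needs to be stored.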

how to create multivariate function handle in matlab in this case?

I would like to create a multivariate function handle whose number of variables changes according to the input.
First, create n symbolic variables, and note that n can be changed according to your input.
n=3;
syms theta [1 n];
Now I create a function g. Using a for loop, I sum g over all the theta variables. As seen in the code, f is a symbolic expression.
g = @(x)(x^2);
f = 0;
for i = 1:n
    f = f + g(sym(sprintfc('theta%d',i)))
end
Now I want to create a function handle F from f.
One potential way to do this is F = @(theta1,theta2,theta3)(f). However, since n is a user-specified, changeable variable, this approach is not workable.
Could someone give me a hint? Many thanks!
Is this what you are looking for?
g = @(x)x.^2
fn = @(varargin) sum( cellfun(g,varargin) )
Now we have an anonymous function with a variable number of inputs. Example use below
fn(1) % = 1
fn(1,5,3) % = 35 = (1^2+5^2+3^2)
fn(1,2,3,4,5,6) % = 91 = (1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2)

scilab - how to return matrices from a function with if-statements?

I have a scilab function that looks something like this (very simplified code just to get the concept of how it works):
function [A, S, Q]=myfunc(a)
    A = a^2;
    S = a+a+a;
    if S > A then
        Q = "Bigger";
    else
        Q = "Lower";
    end
endfunction
And I get the expected result if I run:
--> [A,S,Q]=myfunc(2)
Q =
Bigger
S =
6.
A =
4.
But if I pass matrices into the function, I expect to get matrices of the same shape back for every output. Instead I got this:
--> [A,S,Q]=myfunc([2 4 6 8])
Q =
Lower
S =
6. 12. 18. 24.
A =
4. 16. 36. 64.
Why isn't Q returned as a matrix of values like S and A? And how do I make it return "Bigger. Lower. Lower. Lower." as an answer? That is, I want to perform the operation on each element of the matrix.
Because in your program you wrote Q = "Bigger" and Q = "Lower". That means that Q will only have one value. If you want to store the comparisons for every value in A and S, you have to make Scilab do that.
You can achieve such behavior by using loops. This is how you can do it by using two for loops:
function [A, S, Q]=myfunc(a)
    A = a^2;
    S = a+a+a;
    //Get the size of input a
    [nrows, ncols] = size(a)
    //Traverse all rows of the input
    for i = 1 : nrows
        //Traverse all columns of the input
        for j = 1 : ncols
            //Compare each element
            if S(i,j) > A(i,j) then
                //Store each result
                Q(i,j) = "Bigger"
            else
                Q(i,j) = "Lower"
            end
        end
    end
endfunction
Beware of A = a^2. It can break your function. It behaves differently depending on whether the input a is a vector (1-by-n or n-by-1 matrix), a rectangular matrix (m-by-n matrix, m ≠ n), or a square matrix (n-by-n matrix):
Vector: it works like .^, i.e. it raises each element individually (see Scilab help).
Rectangular: it won't work because it has to follow the rule of matrix multiplication.
Square: it works and follows the rule of matrix multiplication.
I will add that in Scilab, the fewer loops, the better, so @luispauloml's answer can be rewritten as
function [A, S, Q]=myfunc(a)
    A = a.^2; // used element-wise power, see luispauloml's advice
    S = a+a+a;
    Q(S > A) = "Bigger"
    Q(S <= A) = "Lower"
    Q = matrix(Q,size(a,1),size(a,2)) // a-like shape
endfunction

Free-Pascal Implementation of the Sieve of Eratosthenes

My teacher gave me an assignment like this:
Using the number n given, find the largest prime number p with p<=n and n<=10^9.
I tried doing this by using the following function:
Const amax=1000000000;
Var i,j,n:longint;
    a:array [1..amax] of boolean;
Function lp(n:longint):longint;
Var max:longint;
Begin
  For i:=1 to n do a[i]:=true;
  For i:=2 to round(sqrt(n)) do
    If (a[i]=true) then
      For j:=1 to n div i do
        If (i*i+(j-1)*i<=n) then
          a[i*i+(j-1)*i]:=false;
  max:=0;
  i:=n;
  While max=0 do
  Begin
    If a[i]=true then max:=i;
    i:=i-1;
  End;
  lp:=max;
End;
This code worked flawlessly for numbers such as 1 million, but when I tried n=10^9, the program took a long time to print the output. So here's my question: are there any ways to improve my code so that it runs faster? Or maybe a different approach?
The most important aspect here is that the greatest prime that is not greater than n must be fairly close to n. A quick look at The Gaps Between Primes (at The Prime Pages - always worth a look for everything to do with primes) shows that for 32-bit numbers the gaps between primes cannot be greater than 335. This means that the greatest prime not greater than n must be in the range [n - 335, n]. In other words, at most 336 candidates need to be checked - for example via trial division - and this is bound to be lots faster than sieving a billion numbers.
Trial division is a reasonable choice for tasks of this kind, because the range to be scanned is so small. In my answer to Prime sieve implementation (using trial division) in C++ I analysed a couple of ways for speeding it up.
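The trial-division idea can be sketched in a few lines of Python. This is a minimal illustration, not the code from the linked answer. The names `is_prime`, `greatest_prime_not_exceeding` and `MAX_GAP_32BIT` are hypothetical, and the window size relies on the gap bound of 335 quoted above, so n is assumed to fit in 32 bits.

```python
MAX_GAP_32BIT = 335  # gap bound quoted above for 32-bit numbers

def is_prime(k):
    """Trial division by 2 and by odd numbers up to sqrt(k)."""
    if k < 2:
        return False
    if k % 2 == 0:
        return k == 2
    d = 3
    while d * d <= k:
        if k % d == 0:
            return False
        d += 2
    return True

def greatest_prime_not_exceeding(n):
    """Scan downward from n; assumes 2 <= n < 2**32."""
    lowest = max(n - MAX_GAP_32BIT, 2)
    for candidate in range(n, lowest - 1, -1):
        if is_prime(candidate):
            return candidate
    raise ValueError("no prime found in the window")
```

For n = 10^9 this returns 999999937 after testing only a few dozen candidates, most of which are rejected after a handful of divisions.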
The Sieve of Eratosthenes is also a good choice; it just needs to be modified to sieve only the range of interest instead of all numbers from 1 to n. This is called a 'windowed sieve' because it sieves only a window. Since the window will most likely not contain all the primes up to the square root of n (i.e. all the primes that could be potential least prime factors of composites in the range to be scanned), it is best to sieve the factor primes via a separate, simple Sieve of Eratosthenes.
First I'm showing a simple rendition of a normal (non-windowed) sieve, as a baseline for comparing the windowed code to. I'm using C# in order to show the algorithm more clearly than would be possible with Pascal.
List<uint> small_primes_up_to (uint n)
{
    if (n == uint.MaxValue)
        throw new ArgumentOutOfRangeException("n", "n must be less than UINT32_MAX");
    var eliminated = new bool[n + 1]; // +1 because indexed by numbers
    eliminated[0] = true;
    eliminated[1] = true;
    for (uint i = 2, sqrt_n = (uint)Math.Sqrt(n); i <= sqrt_n; ++i)
        if (!eliminated[i])
            for (uint j = i * i; j <= n; j += i)
                eliminated[j] = true;
    return remaining_unmarked_numbers(eliminated, 0);
}
The function has 'small' in its name because it is not really suited for sieving big ranges; I use similar code (with a few bells and whistles) only for sieving the small factor primes needed by more advanced sieves.
The code for extracting the sieved primes is equally simple:
List<uint> remaining_unmarked_numbers (bool[] eliminated, uint sieve_base)
{
    var result = new List<uint>();
    for (uint i = 0, e = (uint)eliminated.Length; i < e; ++i)
        if (!eliminated[i])
            result.Add(sieve_base + i);
    return result;
}
Now, the windowed version. One difference is that the potential least factor primes need to be sieved separately (by the function just shown) as explained earlier. Another difference is that the starting point of the crossing-off sequence for a given prime may lie outside the range to be sieved. If the starting point lies before the start of the window then a bit of modulo magic is necessary to find the first 'hop' that lands in the window. From then on everything proceeds as usual.
List<uint> primes_between (uint m, uint n)
{
    m = Math.Max(m, 2);
    if (m > n)
        return new List<uint>(); // empty range -> no primes
    // index overflow in the inner loop unless `(sieve_bits - 1) + stride <= UINT32_MAX`
    if (n - m > uint.MaxValue - 65521) // highest prime not greater than sqrt(UINT32_MAX)
        throw new ArgumentOutOfRangeException("n", "(n - m) must be <= UINT32_MAX - 65521");
    uint sieve_bits = n - m + 1;
    var eliminated = new bool[sieve_bits];
    foreach (uint prime in small_primes_up_to((uint)Math.Sqrt(n)))
    {
        uint start = prime * prime, stride = prime;
        if (start >= m)
            start -= m;
        else
            start = (stride - 1) - (m - start - 1) % stride;
        for (uint j = start; j < sieve_bits; j += stride)
            eliminated[j] = true;
    }
    return remaining_unmarked_numbers(eliminated, m);
}
The two '-1' terms in the modulo calculation may seem strange, but they bias the logic down by 1 to eliminate the inconvenient case stride - foo % stride == stride that would need to be mapped to 0.
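To see what the biased modulo expression does, here is a small Python sketch that isolates the start-offset computation from primes_between and compares it with a naive walk. The function names `first_offset_in_window` and `first_offset_naive` are made up for this illustration.

```python
def first_offset_in_window(prime, m):
    """Offset (relative to m) of the first multiple of prime to cross off."""
    start, stride = prime * prime, prime
    if start >= m:
        return start - m
    # biased by 1 so that an exact multiple at m maps to offset 0, not stride
    return (stride - 1) - (m - start - 1) % stride

def first_offset_naive(prime, m):
    """Reference: walk multiples from prime*prime until reaching m."""
    j = prime * prime
    while j < m:
        j += prime
    return j - m
```

For prime = 7 and m = 98 the window starts exactly on a multiple of 7. The unbiased form stride - (m - start) % stride would give 7 here, while the biased form gives 0 as required. For prime = 3 and m = 100 both agree on offset 2, i.e. the number 102.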
With this, the greatest prime not exceeding n could be computed like this:
uint greatest_prime_not_exceeding (uint n)
{
    return primes_between(n - Math.Min(n, 335), n).Last();
}
This takes less than a millisecond all told, including the sieving of the factor primes and so on, even though the code contains no optimisations whatsoever. A good overview of applicable optimisations can be found in my answer to prime number summing still slow after using sieve; with the techniques shown there the whole range up to 10^9 can be sieved in about half a second.

How to store a symmetric matrix?

Which is the best way to store a symmetric matrix in memory?
It would be good to save half of the space without compromising the speed and complexity of the structure too much. This is a language-agnostic question, but if you need to make some assumptions, just assume it's a good old plain programming language like C or C++.
It seems to me that this only makes sense if there is a way to keep things simple, or if the matrix itself is really big. Am I right?
Just for the sake of formality I mean that this assertion is always true for the data I want to store
matrix[x][y] == matrix[y][x]
Here is a good method to store a symmetric matrix. It requires only N(N+1)/2 elements of memory:
int fromMatrixToVector(int i, int j, int N)
{
    if (i <= j)
        return i * N - (i - 1) * i / 2 + j - i;
    else
        return j * N - (j - 1) * j / 2 + i - j;
}
For example, for the upper triangle of a 4x4 matrix
0 1 2 3
  4 5 6
    7 8
      9
the 1D representation (stored in std::vector, for example) looks as follows:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The call fromMatrixToVector(1, 2, 4) returns 5, so the matrix element at (1, 2) is vector[5], which holds 5.
For more information see http://www.codeguru.com/cpp/cpp/algorithms/general/article.php/c11211/TIP-Half-Size-Triangular-Matrix.htm
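The index formula can be checked with a short Python sketch. It is a direct transcription of fromMatrixToVector, plus a hypothetical helper `pack_upper` that flattens the upper triangle row by row; both names are illustrative.

```python
def from_matrix_to_vector(i, j, n):
    """Packed index of element (i, j) of a symmetric n x n matrix."""
    if i > j:
        i, j = j, i                      # use the upper triangle only
    return i * n - (i - 1) * i // 2 + j - i

def pack_upper(matrix):
    """Flatten the upper triangle of a square matrix row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]
```

With these, `pack_upper(m)[from_matrix_to_vector(i, j, n)]` equals `m[i][j]` for any symmetric `m`, whichever of i and j is larger.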
I find that many high performance packages just store the whole matrix, but then only read the upper triangle or lower triangle. They might then use the additional space for storing temporary data during the computation.
However if storage is really an issue then just store the n(n+1)/2 elements making the upper triangle in a one-dimensional array. If that makes access complicated for you, just define a set of helper functions.
In C, to access a matrix matA that is stored in full but read only through its upper triangle, you could define a macro:
#define A(i, j, dim) (((i) <= (j)) ? matA[(i)*(dim) + (j)] : matA[(j)*(dim) + (i)])
Then you can access your array nearly normally.
Well, I would try a triangular matrix, like this:
int[][] sym = new int[rows][];
for( int i = 0; i < rows; ++i ) {
    sym[i] = new int[i+1];
}
But then you will have to face the problem of someone wanting to access the "other side". E.g. they want to access [0][10], but in your case this value is stored in [10][0] (assuming at least an 11x11 matrix).
The probably "best" way is the lazy one - don't do anything until the user requests it. So you could load the specific row if the user types something like print(matrix[4]).
If you want to use a one-dimensional array, the code would look something like this:
int[] matrix = new int[(rows * (rows + 1)) >> 1];
int z;
matrix[ ( ( z = ( x < y ? y : x ) ) * ( z + 1 ) >> 1 ) + ( y < x ? y : x ) ] = yourValue;
You can get rid of the multiplications if you create an additional look-up table:
int[] matrix = new int[(rows * (rows + 1)) >> 1];
int[] lookup = new int[rows];
for ( int i = 0; i < rows; i++)
{
    lookup[i] = (i * (i+1)) >> 1;
}
matrix[ lookup[ x < y ? y : x ] + ( x < y ? x : y ) ] = yourValue;
If you're using something that supports operator overloading (e.g. C++), it's pretty easy to handle this transparently. Just create a matrix class that checks the two subscripts, and if the second is greater than the first, swap them:
#include <vector>

template <class T>
class sym_matrix {
    std::vector<std::vector<T> > data;
public:
    T operator()(int x, int y) {
        if (y>x)
            return data[y][x];
        else
            return data[x][y];
    }
};
For the moment I've skipped over everything else, and just covered the subscripting. In reality, to handle use as both an lvalue and an rvalue correctly, you'll typically want to return a proxy instead of a T directly. You'll want a ctor that creates data as a triangle (i.e., for an NxN matrix, the first row will have N elements, the second N-1, and so on -- or, equivalently, 1, 2, ...N). You might also consider creating data as a single vector -- you have to compute the correct offset into it, but that's not terribly difficult, and it will use a bit less memory, run a bit faster, etc. I'd use the simple code for the first version, and optimize later if necessary.
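The single-vector variant mentioned above can be sketched in Python, where `__getitem__` and `__setitem__` play the role of the overloaded subscript. This is only an illustration under assumed names (`SymMatrix`, `_offset`, `fill`), not a translation of the C++ class. It stores the lower triangle row by row, so row x starts at offset x(x+1)/2.

```python
class SymMatrix:
    """Symmetric n x n matrix stored as one packed list of n(n+1)/2 values."""

    def __init__(self, n, fill=0):
        self.n = n
        self._data = [fill] * (n * (n + 1) // 2)

    def _offset(self, x, y):
        if not (0 <= x < self.n and 0 <= y < self.n):
            raise IndexError("index out of range")
        if y > x:
            x, y = y, x                  # swap so that x >= y (lower triangle)
        return x * (x + 1) // 2 + y

    def __getitem__(self, key):
        x, y = key
        return self._data[self._offset(x, y)]

    def __setitem__(self, key, value):
        x, y = key
        self._data[self._offset(x, y)] = value
```

After `m = SymMatrix(10)` and `m[0, 9] = 7`, reading `m[9, 0]` gives 7 as well, because both subscripts resolve to the same slot.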
You could use a staggered array (often called a jagged array) if your language supports it, and when x < y, switch the position of x and y. So...
Pseudocode (somewhat Python style, but not really) for an n x n matrix:
matrix[n][]
for i from 0 to n-1:
    matrix[i] = some_value_type[i + 1]
[next, assign values to the elements of the half-matrix]
And then when referring to values....
if x < y:
    return matrix[y][x]
else:
    return matrix[x][y]

Complexity of a given function

When I analyzed the complexity of the code segment below, I found that it is O(n/2). But while searching the internet I discovered that it is probably O(n). I'd like to know who's correct.
void function(int n) {
    int i = 1, k = 100;
    while (i < n) {
        k++;
        i += 2;
    }
}
What is the point of the variable k in the above method? Regardless, big-O notation talks about the behavior in the limit (as the value of n approaches infinity). As such, big-O notation is agnostic to both scaling factors and constants. That is, for any constant "c" and positive scaling factor "s",
O(f(n)) is equivalent to O(s*f(n) + c)
In your case f(n) = n, s = 1/2, and c = 0. So...
O(n) = O(n/2)
O(n) is the same as O(n/2)
The idea of big-O notation is to understand how fast an algorithm will run as you give it a larger input. So, for example, if you double the size of your input, will the program take twice as long or will it take 4 times as long.
Both n and n/2 behave identically as you vary the value of n (i.e. if you increase n by a factor of 10, both n itself and n/2 grow by that same factor), so they fall in the same complexity class.
O(n/2) = O(0.5n) = O(n). See Wikipedia for more on this.
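The doubling argument can be checked empirically. The following Python sketch mirrors the loop from the question and counts its iterations instead of incrementing k; `loop_iterations` is a name chosen for this illustration.

```python
def loop_iterations(n):
    """Count how many times the while loop in the question runs."""
    i, count = 1, 0
    while i < n:
        count += 1
        i += 2
    return count
```

The count is about n/2, so doubling n doubles the work. That linear growth is what O(n) expresses; the factor 1/2 does not change it.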
If f is O(g), then there exist some c and n such that for all x > n, |f(x)| <= c * |g(x)|. That is, from input n onwards, c * g(x) dominates f(x).
It follows that O(n/2) = O(n), because,
If f(x) = x/2 and g(x) = x, then we set c = 0.5 and n = 0.
If f(x) = x and g(x) = x/2, then we set c = 2 and n = 0.
Note that there are infinitely many values for c and n that you can use to prove this. (In the above I minimized them, but that is not necessary.)