Randomly creating a boolean function of N boolean variables

I want to write a program that randomly creates a function which receives N binary values as input and maps them to one binary value. The naïve approach would be to create all 2^(2^N) such functions, represented as truth tables, and choose one at random, but this is impractical for large N. In addition, since representing the chosen function as a truth table is memory inefficient, it would be desirable to represent it as a formula y = f(x1,x2,...,xN).
Thanks!

Assume we have an array x[1..n] of Boolean values. The idea is to create an auxiliary Bernoulli distribution b and combine the inputs in sequence, applying either AND or OR at each step with a probability of 50%.
1. result := true
   b := Bernoulli(0.5)
2. For i = 1 To n Do:
3.     If next(b)
           Then result := and(result, x[i])
           Else result := or(result, x[i])
4. Return result
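The steps above could be sketched in Python roughly as follows. The name random_and_or_chain and the optional rng parameter are made up for illustration. The sketch returns both a formula string and a callable, since the question asks for a formula rather than a truth table. Note that a chain of n AND/OR choices can produce at most 2^n different functions, so this samples from a small family rather than uniformly from all 2^(2^N) functions.

```python
import random

def random_and_or_chain(n, rng=None):
    """Draw one AND/OR operator per input and return (formula, function)."""
    rng = rng or random.Random()
    # True means AND, False means OR; each is chosen with probability 0.5.
    ops = [rng.random() < 0.5 for _ in range(n)]

    def f(x):
        # Fold the inputs into the result, starting from true as in step 1.
        result = True
        for use_and, xi in zip(ops, x):
            result = (result and xi) if use_and else (result or xi)
        return bool(result)

    # Build the same chain as a readable, fully parenthesized formula.
    formula = "true"
    for i, use_and in enumerate(ops, start=1):
        formula = "({} {} x{})".format(formula, "and" if use_and else "or", i)
    return formula, f

formula, f = random_and_or_chain(3, random.Random(0))
print(formula)
print(f([True, False, True]))
```

Passing a seeded random.Random makes the drawn function reproducible, which is handy when the same function has to be regenerated later.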

Maxima: how to distinguish between a row of a matrix and a row vector?

I'd like to build a function that can take a vector (i.e. a 1xm or nx1 matrix) or a column / row of a matrix as input; however, I've come upon something that seems a bit weird: even though Maxima handles vectors as matrices with either 1 row or 1 column, it has different requirements for referring to their elements.
For example:
aMatrix:matrix([1,2,3],[4,5,6]);
matrixVec: aMatrix[1];
aVec:matrix([1,2,3]);
Now, even though matrixVec and aVec a) were both obtained from the matrix function, and b) have the same dimensions (as determined by length() and length(transpose())), referencing their elements requires completely different notations:
matrixVec[1,1]; returns an error;
whereas aVec[1,1]; returns 1, as expected.
I think I understand why this would be; however, because both of these objects return true from matrixp (and have the same dimensions), I have no idea how to distinguish them in my code, so that I can define proper handling.
What kind of if statement could I use to distinguish these two so that I can define value = x[i] for the matrix and value = x[1,i] for the row vector?
Stumbled upon a solution while working on something else: it turns out that Maxima treats the row or column of a matrix as a list, although it doesn't treat a row or column vector as a list, i.e. given
aMatrix : matrix([1,2,3],[4,5,6]);
matrixVec : aMatrix[1];
aVec : matrix([1,2,3]);
listp(matrixVec) returns "true" whereas listp(aVec) returns "false".
i.e. listp() can be used to distinguish a 1xm or nx1 matrix from the row or column of a matrix.

Generate unique serial from id number

I have a database that increases id incrementally. I need a function that converts that id to a unique number between 0 and 1000. (the actual max is much larger but just for simplicity's sake.)
1 => 3301,
2 => 0234,
3 => 7928,
4 => 9821
The number generated cannot have duplicates.
It can not be incremental.
Need it generated on the fly (not create a table of uniform numbers to read from)
I thought of a hash function, but there is a possibility of collisions.
Random numbers could also have duplicates.
I need a minimal perfect hash function but cannot find a simple solution.
Since the criteria are sort of vague (good enough to fool the average person), I am unsure exactly which route to take. Here are some ideas:
You could use a Pearson hash. According to the Wikipedia page:
Given a small, privileged set of inputs (e.g., reserved words for a compiler), the permutation table can be adjusted so that those inputs yield distinct hash values, producing what is called a perfect hash function.
You could just use a complicated looking one-to-one mathematical function. The drawback of this is that it would be difficult to make one that was not strictly increasing or strictly decreasing due to the one-to-one requirement. If you did something like (id ^ 2) + id * 2, the interval between ids would change and it wouldn't be immediately obvious what the function was without knowing the original ids.
You could do something like this:
new_id = (old_id << 4) + arbitrary_4bit_hash(old_id);
This would give unique IDs, and it wouldn't be immediately obvious that the lowest 4 bits are just garbage (especially when reading the numbers in decimal format). Like the last option, the new IDs would be in the same order as the old ones. I don't know if that would be a problem.
You could just hardcode all ID conversions by making a lookup array full of "random" numbers.
You could use some kind of hash function generator like gperf.
GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.
You could encrypt the ids with a key using a cryptographically secure mechanism.
Hopefully one of these works for you.
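As one possible instance of the "one-to-one mathematical function" idea that is not monotonic, here is a minimal Python sketch. It relies on the fact that multiplying by a constant coprime to the range size, modulo that size, permutes 0..M-1. The constants M, A, B and the names scramble and unscramble are assumptions picked for illustration, not part of the suggestions above, and the mapping is not cryptographically secure. pow(A, -1, M) needs Python 3.8 or later.

```python
from math import gcd

M = 1000   # size of the output range (assumed for illustration)
A = 387    # multiplier; must be coprime to M so the mapping is one-to-one
B = 541    # arbitrary offset so that 0 does not map to 0

assert gcd(A, M) == 1
A_INV = pow(A, -1, M)  # modular inverse of A, used to undo the mapping

def scramble(n):
    """Map an id in 0..M-1 to a unique value in 0..M-1."""
    return (n * A + B) % M

def unscramble(s):
    """Recover the original id from a scrambled value."""
    return ((s - B) * A_INV) % M

print([scramble(i) for i in range(1, 5)])  # [928, 315, 702, 89]
```

Because the mapping is a permutation of 0..M-1, no two ids below M can collide, and the original id can always be recovered.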
Update
Here is the rotational shift the OP requested:
function map($number)
{
    // Move the upper 7 bits down to the low end and the lower 3 bits
    // up to the high end.
    // Also, mask out all but 10 bits. This allows unique mappings
    // from 0-1023 to 0-1023
    $high_bits = 0b0000001111111000 & $number;
    $new_low_bits = $high_bits >> 3;
    $low_bits = 0b0000000000000111 & $number;
    $new_high_bits = $low_bits << 7;
    // Recombine bits
    $new_number = $new_high_bits | $new_low_bits;
    return $new_number;
}
function demap($number)
{
    // Inverse of map(): move the upper 3 bits down to the low end
    // and the lower 7 bits up to the high end
    $high_bits = 0b0000001110000000 & $number;
    $new_low_bits = $high_bits >> 7;
    $low_bits = 0b0000000001111111 & $number;
    $new_high_bits = $low_bits << 3;
    // Recombine bits
    $new_number = $new_high_bits | $new_low_bits;
    return $new_number;
}
This method has its advantages and disadvantages. The main disadvantage that I can think of (besides the security aspect) is that for lower IDs consecutive numbers will be exactly the same (multiplicative) interval apart until digits start wrapping around. That is to say
map(1) * 2 == map(2)
map(1) * 3 == map(3)
This happens, of course, because with lower numbers, all the higher bits are 0, so the map function is equivalent to just shifting. This is why I suggested using pseudo-random data for the lower bits rather than the higher bits of the number. It would make the regular interval less noticeable. To help mitigate this problem, the function I wrote shifts only the first 3 bits and rotates the rest. By doing this, the regular interval will be less noticeable for all IDs greater than 7.
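To see the interval effect and the one-to-one property concretely, the same 10-bit rotation can be ported to Python as a quick check. The names rotate_map and rotate_demap are only used here for illustration; the masks and shift amounts are the same as in the PHP functions above.

```python
def rotate_map(number):
    # Upper 7 bits move down by 3; lower 3 bits move up by 7.
    new_low_bits = (number & 0b1111111000) >> 3
    new_high_bits = (number & 0b0000000111) << 7
    return new_high_bits | new_low_bits

def rotate_demap(number):
    # Inverse: upper 3 bits move down by 7; lower 7 bits move up by 3.
    new_low_bits = (number & 0b1110000000) >> 7
    new_high_bits = (number & 0b0001111111) << 3
    return new_high_bits | new_low_bits

# IDs 1..7 are evenly spaced (multiples of 128); from 8 on the pattern breaks.
print([rotate_map(i) for i in range(1, 10)])
# [128, 256, 384, 512, 640, 768, 896, 1, 129]

# Every value in 0..1023 maps to a distinct value and can be mapped back.
assert len({rotate_map(i) for i in range(1024)}) == 1024
assert all(rotate_demap(rotate_map(i)) == i for i in range(1024))
```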
It seems that it doesn't have to be numerical? What about an MD5-Hash?
select md5(id+rand(10000)) from ...

Iteratively declaring distinct variables

This question is strongly related to this and this question.
The distinct function of Z3
(declare-const a S)
(declare-const b S)
(assert (distinct a b))
allows constraining sets of variables (here a and b) such that all variables in the set must take different values.
My question is: is it also possible to force a variable to take a unique value without explicitly referring to the set of variables from which it should be distinct? Something like
(declare-unique-const a S)
(declare-unique-const b S)
(declare-unique-const c S)
This would be nice in situations where you declare new variables in an iterative process, for example, during program verification.
If it is not possible, I guess one has to keep track of all distinct variables and use that set to emit appropriate (distinct newvar oldvar1 ... oldvarn) constraints.
We can define an auxiliary fresh function f from S to Int, and assert
f(a_1) = 1
f(a_2) = 2
f(a_3) = 3
...
f(a_n) = n
Then, a_1, ..., a_n must be different from each other.
If we want to say that b is also different from all the a_i, we just assert
f(b) = n+1
In this approach, we only have to track the counter.
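A minimal sketch of the counter bookkeeping, written as a small Python helper that only emits SMT-LIB text (it does not call Z3 itself). The class name UniqueConsts and its methods are hypothetical, and the sort S is assumed to be declared elsewhere.

```python
class UniqueConsts:
    """Emit SMT-LIB text that keeps every declared constant distinct."""

    def __init__(self, sort="S", tag_fn="f"):
        self.sort = sort
        self.tag_fn = tag_fn
        self.counter = 0  # the only state that has to be tracked
        # Auxiliary function from the sort to Int (the sort itself is
        # assumed to be declared elsewhere in the script).
        self.lines = ["(declare-fun {} ({}) Int)".format(tag_fn, sort)]

    def declare_unique(self, name):
        # Each new constant gets the next integer tag, so two constants
        # with different tags can never be equal.
        self.counter += 1
        self.lines.append("(declare-const {} {})".format(name, self.sort))
        self.lines.append(
            "(assert (= ({} {}) {}))".format(self.tag_fn, name, self.counter))

    def script(self):
        return "\n".join(self.lines)

u = UniqueConsts()
for name in ["a", "b", "c"]:
    u.declare_unique(name)
print(u.script())
```

Each call adds one declaration and one assertion, so the number of constraints grows linearly with the number of constants instead of re-emitting an ever larger distinct constraint.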

Iterate over characters in string in mysql

First of all, I have a very concrete question, but maybe an alternative approach to my problem (second part) could also help me.
Is there a way to address a character in a string via its index in MySQL (e.g. in PHP, $var[2] gives you the 3rd character)?
The obvious way is SUBSTRING(var, 3, 1), but since my strings are 1024 characters long I assume this is not the fastest solution. As shown in the code sample, using SUBSTRING to retrieve the tail of the string yields no performance gain either. Is there maybe a way to iterate over a string? (Shift the first element?)
CREATE FUNCTION hashDiff( hash1 TEXT(1024), hash2 TEXT(1024), threshold INT)
RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE diff, x, b1, b2 INT;
    SET diff = 0;
    SET x = 0;
    WHILE (x < 1024 AND diff < threshold) DO
        SET b1 = ASCII(hash1); -- uses first character only!!
        SET b2 = ASCII(hash2);
        SET hash1 = SUBSTRING(hash1, 2);
        SET hash2 = SUBSTRING(hash2, 2);
        SET diff = diff + ((b1-b2)*(b1-b2));
        SET x = x + 1;
    END WHILE;
    RETURN diff;
END
In case it is not already clear from the code, I am trying to write a stored function that calculates the difference or distance between two hashes. The difference is the sum of the character-wise squared distances (e.g. hashDiff(AA,AC)=(65-65)²+(65-67)²=4). The first major performance boost could be achieved by introducing a threshold that cancels the calculation once the hashes are already too different. But since MySQL is not my "everyday" language, I am stuck at this point in finding other optimizations. For completeness, two sample hashes:
YAAAAAAYAAAYAAVAAQAARAOAAOAQASAQAMAKAKAJIAJAJIAHAHIAKJAIIAHHAHIIAIHGAGFFAGGFEAFEEEEAEDDDDDAEEEEDEEEFAFFFFFFEFFFEFFFFFGFEEFFEEEFFFJEFFEEEEEEELFFFFEEFJEEEEDIEEEEEIEEEEHEEEJEEFKFEFKGGFNHGOIIJTJKYONYNMTGHNHHQISJJQIKWLXJJSMYRQWJOGKDDFCCBBAAAAAAAAAAAAAAAAAAAAAAAYAAAAAAYAAAYAAWAARAASASAAQARAUAYAYATAOALKAJAJIAIAHHAHGAGFAFFAEFFAEFFAFFFAEEFFAFEEEDADEDDDDADDDCDDDDDAEEEFEEEEDDDEEEDEDDEEEFEFFGGFMFHGFFFGFFFLGHGGHGGNHHGGGOHGHGHMGGFGMFFFMFGFLFFFMGFFMGGMGGGNGGMGGLGGLGGMGGLEIEEHDCGCGCDGDGDCGDFCECCECECECECFCECFCFCFCFCFCGCJGYCYAAAAAAYAAAYAAUAATAAUAUAAUARARAQAPAPASARRAPARQAPAQQAQQAQSAKMATKKAIIHAIHGAGGGGAGHHGGAGGFGFFAFFGEFFFFFAFFGFGGGFFFEEFGFFGGFGGHIJJLKLWLKJJIJJJKJRLJKLKKKUKLLKKUMMKJIQIIIISKJJWKLLXMLMYMLNYMMYMLLWJIQIINFGKFFKEEIDHEDHDDFCECCFDECCFCFDGCDGCGCGEGCDCECECFDFCGDGCIEKEOAYNFBREUXKPQMMQTKTMMNJLPPVYYYTOUOPOLLJKKJJJIJIMJJJLIJJLLJIIHHIHHHIGHIHIHJHHHJHHIHGHGHFGHGFFEFEEEFEFEFFGGHIHIHGHGHHIIIIHIIJMNLONKLKKKKKKKMLKKLONMKOOOMLOPONMNMKKLLKKLMNKLMMMNMOPPOORPORSSVRTSSRTRRTSSTTXSTQRPONOKKLKLJMKJJIJIIHHHIIIJHIJIJJIJIKJIMWMYYDAAAAAAAAAAA
AAAAAAAAAAABAABAACAACACAACADADAEADADADADDAEAEEAEAFEAEEAEFAFGAGGGAGGGAHHHAHIIIAIHIJHAIIHIHHAJIHIJIJKJAJJJIKJJJJKKJKJKKLKLKLLMMMNNMYOOOOOOPOONYOONONNPYNOOOPYOOPPPYNONNYMLLWLLKUJIISHIHOGGMFGFLFFMGGLFGLGFLFFKFKFFLEEKFLEFJFKFGNGNHLFHJFIEGDIEKGOIRFGBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABABACACDACADDACADDADDAEDAFEAEEFAFFFAFFGAFGGGAIGHHIAHIHHHHAHIHIIJJIIAIIIIJIJKJIIIIJJHIIHIIIIJIIIIRJJJJKJJJJLVKLLKLLKXLMMKMXMLLLMWMMMMYMNLYMNNYNNMYMMNYMLYLMLXKJRIHPHIMGGMFEJEJEEIEEHDGCDFCFDCFCECECCEBEBECFDGCFDNGLDBAAAAAAAAAAAAAAAAAAAAAABAAAAABAABABAACACACACACACACADDADAEEAFAFGAFGAHGAGGAGGHAGGIAIHJAJJJJAJKKKKAMLMNNNANOMMNNMMNAONMNOOOMOOPOMNOMMNPOOPPPPRQQYPPRPPPPPNOYLLMMMMLYLMLMLYLMLMMYLNNMYNLLWMLKXLLLUKIKQIIQGHHPFHNGFLFFLGFJEEJEIDDIDCHDFCDGCFCCFCECECCECFCGDGDHDHDIFIDEBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAABBBBBCBCCCCDCCCCCCCCDDDDEDEEEEFFDEGGHGHHHGHHHHHHIIJJJJJIJJJJJJIKJJKLKKMMNMMMMMMMNNNNNNLNNONPONNNOOOOPQQQRSSSSSSUTSTUUUVWVVXUYXWVXVXWYVYWYVYYUWVUTTSSPQPQOPOPONONOMONOOONNNMMNLJJKJIIJHHGGGFHFGFFFFEEEDDEEEEFGGIGJLRNEAAAAAAAAAAAAA
Any help or hint would be appreciated.
The only way you would be able to use an array of sorts would be to use temporary tables and cursors/result sets.
The problem is you will still need to iterate over the strings and use substring to break them apart. To my knowledge there is no 'wordwrap' or 'explode' function to chop up the string.
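For reference, the distance that the question's function computes can be sketched outside MySQL in a few lines of Python, which may help when checking a stored function against known values. The name hash_diff is made up for this sketch. Unlike the SQL loop, which always runs up to 1024 positions, this version simply stops at the end of the shorter string.

```python
def hash_diff(hash1, hash2, threshold):
    """Sum of squared character-code differences, cut off at a threshold."""
    diff = 0
    for c1, c2 in zip(hash1, hash2):
        # Same early exit as the WHILE condition: stop once the running
        # sum has reached the threshold.
        if diff >= threshold:
            break
        d = ord(c1) - ord(c2)
        diff += d * d
    return diff

print(hash_diff("AA", "AC", 100))  # 4, matching the example in the question
```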

(Ordered) Set Partitions in fixed-size Blocks

Here is a function I would like to write but am unable to do so. Even if you
don't / can't give a solution I would be grateful for tips. For example,
I know that there is a correlation between the ordered representations of the
sum of an integer and ordered set partitions but that alone does not help me in
finding the solution. So here is the description of the function I need:
The Task
Create an efficient* function
List<int[]> createOrderedPartitions(int n_1, int n_2,..., int n_k)
that returns a list of arrays of all set partitions of the set
{0,...,n_1+n_2+...+n_k-1} into k blocks (one per argument) of sizes (in this
order) n_1,n_2,...,n_k (e.g. n_1=2, n_2=1, n_3=1 -> ({0,1},{3},{2}),...).
Here is a usage example:
int[] partition = createOrderedPartitions(2,1,1).get(0);
partition[0]; // -> 0
partition[1]; // -> 1
partition[2]; // -> 3
partition[3]; // -> 2
Note that the number of elements in the list is
(n_1+n_2+...+n_k choose n_1) * (n_2+n_3+...+n_k choose n_2) * ... *
(n_k choose n_k). Also, createOrderedPartitions(1,1,1) would create the
permutations of {0,1,2} and thus there would be 3! = 6 elements in the
list.
* by efficient I mean that you should not initially create a bigger list
like all partitions and then filter out results. You should do it directly.
Extra Requirements
If an argument is 0 treat it as if it was not there, e.g.
createOrderedPartitions(2,0,1,1) should yield the same result as
createOrderedPartitions(2,1,1). But at least one argument must not be 0.
Of course all arguments must be >= 0.
Remarks
The provided pseudo code is quasi Java but the language of the solution
doesn't matter. In fact, as long as the solution is fairly general and can
be reproduced in other languages it is ideal.
Actually, even better would be a return type of List<Tuple<Set>> (e.g. when
creating such a function in Python). However, then the arguments which have
a value of 0 must not be ignored. createOrderedPartitions(2,0,2) would then
create
[({0,1},{},{2,3}),({0,2},{},{1,3}),({0,3},{},{1,2}),({1,2},{},{0,3}),...]
Background
I need this function to make my mastermind-variation bot more efficient and
most of all the code more "beautiful". Take a look at the filterCandidates
function in my source code. There are unnecessary
/ duplicate queries because I'm simply using permutations instead of
specifically ordered partitions. Also, I'm just interested in how to write
this function.
My ideas for (ugly) "solutions"
Create the powerset of {0,...,n_1+...+n_k-1}, filter out the subsets of size
n_1, n_2 etc. and create the cartesian product of the n subsets. However
this won't actually work because there would be duplicates, e.g.
({1,2},{1})...
First choose n_1 of x = {0,...,n_1+n_2+...+n_k-1} and put them in the
first set. Then choose n_2 of x without the n_1 chosen elements
beforehand and so on. You then get for example ({0,2},{},{1,3},{4}). Of
course, every possible combination must be created so ({0,4},{},{1,3},{2}),
too, and so on. Seems rather hard to implement but might be possible.
Research
I guess http://rosettacode.org/wiki/Combinations
goes in the direction I want; however, I don't see how I can utilize it for my
specific scenario.
You know, it often helps to phrase your thoughts in order to come up with a solution. It seems that then the subconscious just starts working on the task and notifies you when it found the solution. So here is the solution to my problem in Python:
from itertools import combinations

def partitions(*args):
    def helper(s, *args):
        # All block sizes consumed: one empty partition ends the recursion.
        if not args: return [[]]
        res = []
        for c in combinations(s, args[0]):
            # Remove the chosen block from the remaining elements.
            s0 = [x for x in s if x not in c]
            for r in helper(s0, *args[1:]):
                res.append([c] + r)
        return res
    s = range(sum(args))
    return helper(s, *args)

print(partitions(2, 0, 2))
The output is:
[[(0, 1), (), (2, 3)], [(0, 2), (), (1, 3)], [(0, 3), (), (1, 2)], [(1, 2), (), (0, 3)], [(1, 3), (), (0, 2)], [(2, 3), (), (0, 1)]]
It is adequate for translating the algorithm to Lua/Java. It is basically the second idea I had.
The Algorithm
As I already mentioned in the question, the basic idea is as follows:
First choose n_1 elements of the set s := {0,...,n_1+n_2+...+n_k-1} and put them in the
first set of the first tuple in the resulting list (e.g. [({0,1,2},... if the chosen elements are 0,1,2). Then choose n_2 elements of the set s_0 := s without the n_1 previously chosen elements, and so on. One such tuple might be ({0,2},{},{1,3},{4}). Of
course, every possible combination is created, so ({0,4},{},{1,3},{2}) is another such tuple, and so on.
The Realization
At first the set to work with is created (s = range(sum(args))). Then this set and the arguments are passed to the recursive helper function helper.
helper does one of the following things: If all the arguments are processed return "some kind of empty value" to stop the recursion. Otherwise iterate through all the combinations of the passed set s of the length args[0] (the first argument after s in helper). In each iteration create the set s0 := s without the elements in c (the elements in c are the chosen elements from s), which is then used for the recursive call of helper.
So what happens with the arguments in helper is that they are processed one by one. helper may first start with helper([0,1,2,3], 2, 1, 1) and in the next invocation it is for example helper([2,3], 1, 1) and then helper([3], 1) and lastly helper([]). Of course another "tree-path" would be helper([0,1,2,3], 2, 1, 1), helper([1,2], 1, 1), helper([2], 1), helper([]). All these "tree-paths" are created and thus the required solution is generated.
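As a variation on the same recursion, the flat int[]-style result asked for in the question could be sketched with a generator, so that the whole list does not have to be built up front. The names ordered_partitions and expected_count are made up for this sketch. Zero-size arguments are dropped, as in the "Extra Requirements", and the count is checked against the multinomial coefficient, which equals the product of binomials given in the question.

```python
from itertools import combinations
from math import factorial

def ordered_partitions(*sizes):
    """Yield each ordered partition of {0..sum(sizes)-1} as one flat list."""
    sizes = [n for n in sizes if n > 0]  # zero-size blocks are ignored

    def helper(remaining, sizes):
        if not sizes:
            yield []
            return
        for block in combinations(remaining, sizes[0]):
            chosen = set(block)
            rest = [x for x in remaining if x not in chosen]
            for tail in helper(rest, sizes[1:]):
                yield list(block) + tail

    return helper(list(range(sum(sizes))), sizes)

def expected_count(*sizes):
    # Multinomial coefficient (n_1+...+n_k)! / (n_1! * ... * n_k!)
    count = factorial(sum(sizes))
    for n in sizes:
        count //= factorial(n)
    return count

parts = list(ordered_partitions(2, 1, 1))
print(len(parts), expected_count(2, 1, 1))  # 12 12
print(parts[0])                             # [0, 1, 2, 3]
```

With block sizes 2, 1, 1, the first two entries of each list form the first block, the third entry the second block, and the fourth entry the third block.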