Reverse function - function

I have been trying to reverse a fairly simple-looking function.
The function is presented in assembly:
(Argument is loaded into AX)
AND AX, 0xFFFE (round down to even number)
MUL AX (Multiply AX by AX ; the result is represented as DX:AX)
XOR AX,DX
The function can be described as: H(X) = F(X & 0xFFFE); F(X) = ((X * X) mod 2^16) xor ((X * X) div 2^16)
I calculated all of the values from 1 to 2^16 and plotted them in MATLAB in order to "see" some function.
Can anyone help me find an answer to this? (when given y what is the argument x).
It might be that for some values there is more than one answer, so narrowing it down is my goal.
Thanks,
Or.

It's a hash function.
You can't reverse a hash function, because the whole point of it is that it's a one way function.
The multiply is clearly reversible, it's the xor that's not. By combining the low and high part of the multiplication you lose information.
As you can see in the plot there are some white spaces. Because there are 2^16 spaces in that plot, that means there are also different input values that hash to the same value.
This is common in a hash function.
The only way to 'reverse' it is to build a lookup table that translates output values into possible input values. However, you will find that for every reachable output value there will be 1 or more input values.
An even number times an even number is always a multiple of 4.
So the low 2 bits are always 0, ergo the low 2 bits of the result are bits 16+17 of the multiplication.
Bits 2..15 are a mix of bits 2..15 xor bits 18..31.
A quick simulation shows 24350 unique outputs, ergo on average 0.34 duplicates for every output value, not bad.
The maximum number of collisions is 6, but most numbers don't collide.
For all those numbers that don't collide you can uniquely lookup your input value in the lookup table (all this disregarding odd input values obviously).
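The lookup-table idea can be sketched in Python as follows. This is only a sketch of the approach described above: the names h and build_inverse_table are made up for illustration, and only even inputs are enumerated because the AND discards bit 0.

```python
from collections import defaultdict


def h(x: int) -> int:
    """H(X) from the question: square (X & 0xFFFE), then XOR the two 16-bit halves."""
    x &= 0xFFFE                 # AND AX, 0xFFFE
    product = x * x             # MUL AX  -> DX:AX (fits in 32 bits)
    low = product & 0xFFFF      # AX
    high = product >> 16        # DX
    return low ^ high           # XOR AX, DX


def build_inverse_table() -> dict:
    """Map every reachable output to the list of even inputs producing it."""
    table = defaultdict(list)
    for x in range(0, 1 << 16, 2):
        table[h(x)].append(x)
    return table


if __name__ == "__main__":
    table = build_inverse_table()
    print("unique outputs:", len(table))
    print("largest bucket:", max(len(xs) for xs in table.values()))
```

Outputs whose bucket holds a single entry can be inverted uniquely (up to the discarded low bit); the others only narrow the input down to a short candidate list.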

Related

LC-3 algorithm for converting ASCII strings to Binary Values

Figure 10.4 provides an algorithm for converting ASCII strings to binary values. Suppose the decimal number is arbitrarily long. Rather than store a table of 10 values for the thousands-place digit, another table for the 10 ten-thousands-place digit, and so on, design an algorithm to do the conversion without resorting to any tables whatsoever.
I have attached pictures of figure 10.4. I am not looking for an answer to the problem, but rather can someone please explain this problem and perhaps give some direction on how to go about creating the algorithm?
Figure 10.4
Figure 10.4 second image
I am unsure as to what it means by tables and do not know where to start really.
The tables are those global, initialized arrays: one called Lookup10 holding 10, 20, 30, 40, ..., and another called Lookup100 holding 100, 200, 300, 400...
You can ignore the tables: as per the assignment instructions you're supposed to find a different way to accomplish this anyway.  Or, you can run that code in simulator or mentally to understand how it works.
The bottom line is that LC-3, while it can do anything (it is Turing complete), can't do much in any one instruction.  For arithmetic & logic, it can do add, not, and.  That's pretty much it!  But that's enough: note that any digital logic can be built from only one kind of gate, namely NAND, which is a binary operator (so NAND is directly available; NOT by providing NAND with the same operand for both inputs; AND by doing NOT after NAND; OR by using NOT on both inputs first and then NAND; etc.).
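As a small illustration of the NAND remark, here is a Python sketch that derives NOT, AND and OR from a single nand helper. The 16-bit mask mirrors the LC-3 word size; the function names are hypothetical and chosen only for this example.

```python
MASK = 0xFFFF  # LC-3 words are 16 bits wide


def nand(a: int, b: int) -> int:
    """The only primitive used below."""
    return ~(a & b) & MASK


def not_(a: int) -> int:
    return nand(a, a)                 # same operand on both inputs


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))           # NOT after NAND


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))     # NOT both inputs, then NAND
```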
For example, LC-3 cannot multiply or divide or modulus or right shift directly: each of those operations takes many instructions and, in the general case, some looping construct.  Multiplication can be done by repetitive addition, and division/modulus by repetitive subtraction.  These are super inefficient for larger operands; there are much more efficient algorithms, but they are also substantially more complex than the repetitive approach.
That subroutine goes backwards through the user's input string.  It takes a string length count in R1 as a parameter supplied by the caller (not shown).  It looks at the last character in the input and converts it from an ASCII character to a binary number.
(We would commonly do that conversion from ascii character to numeric value using subtraction: moving the character values from the ascii character range of 0x30..0x39 to numeric values in the range 0..9, but they do it with masking, which also works.  The subtraction approach integrates better with error detection (checking if not a valid digit character, which is not done here), whereas the masking approach is simpler for LC-3.)
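The two conversions can be compared with a short Python sketch (function names are hypothetical). It shows why the subtraction form lends itself to validation while masking silently accepts non-digits.

```python
def digit_by_mask(ch: str) -> int:
    """Keep only the low 4 bits: '0'..'9' (0x30..0x39) become 0..9."""
    return ord(ch) & 0x0F


def digit_by_subtraction(ch: str) -> int:
    """Subtract 0x30 and reject anything that does not land in 0..9."""
    value = ord(ch) - 0x30
    if not 0 <= value <= 9:
        raise ValueError(f"not a decimal digit: {ch!r}")
    return value
```

For example, digit_by_mask("A") quietly returns 1 because 0x41 & 0x0F is 1, whereas digit_by_subtraction("A") raises an error.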
The subroutine then obtains the 2nd last digit (moving backwards through the user's input string), converting that to binary using the mask approach.  That yields a number between 0 and 9, which is used as an index into the first table Lookup10.  The value obtained from the table at that index position is basically the index × 10.  So this table is a × 10 table.  The same approach is used for the third-last digit (the first digit of a three-digit input), except it uses the 2nd table, which is a × 100 table.
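A rough Python model of that table-driven routine might look like the following. It is not a transcription of Figure 10.4: the table contents are reconstructed from the description above (assuming entry 0 holds 0), and the function name is made up.

```python
# Assumed table layout: entry i holds i * 10 and i * 100 respectively.
LOOKUP10 = [10 * i for i in range(10)]
LOOKUP100 = [100 * i for i in range(10)]


def table_atoi(digits: str) -> int:
    """Convert 1 to 3 ASCII digits, working backwards from the last character."""
    count = len(digits)
    if not 1 <= count <= 3:
        raise ValueError("this table-based version handles 1 to 3 digits only")
    value = ord(digits[-1]) & 0x0F                    # ones place
    if count >= 2:
        value += LOOKUP10[ord(digits[-2]) & 0x0F]     # tens place via table
    if count == 3:
        value += LOOKUP100[ord(digits[-3]) & 0x0F]    # hundreds place via table
    return value
```

Every additional digit position would need another table, which is exactly the limitation the exercise asks you to remove.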
The standard approach for string to binary is called atoi (search it), standing for ASCII to integer.  It moves forwards through the string, and for every new digit, it multiplies the existing value, computed so far, by 10 before adding in the next digit's numeric value.
So, if the string is 456, first it obtains 4, then because there is another digit, 4 × 10 = 40, then + 5 for 45, then × 10 for 450, then + 6 for 456, and so on.
The advantage of this approach is that it can handle any number of digits (up to overflow).  The disadvantage, of course, is that it requires multiplication, which is a complication for LC-3.
Multiplication where one operand is the constant 10 is fairly easy even in LC-3's limited capabilities, and can be done with simple addition without looping.  Basically:
n × 10 = n + n + n + n + n + n + n + n + n + n
and LC-3 can do those 9 additions in just 9 instructions.  Still, we can also observe that:
n × 10 = n × 8 + n × 2
and also that:
n × 10 = (n × 4 + n) × 2     (which is n × 5 × 2)
which can be done in just 4 instructions on LC-3 (and none of these needs looping)!
So, if you want to do this approach, you'll have to figure out how to go forwards through the string instead of backwards as the given table version does, and, how to multiply by 10 (use any one of the above suggestions).
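To make the forward approach concrete, here is a Python sketch that uses only additions for the multiply by 10, following the (n × 4 + n) × 2 decomposition. It models the logic only and is not LC-3 code; the names times_ten and atoi_forward are illustrative.

```python
def times_ten(n: int) -> int:
    """n * 10 using four additions and no loop."""
    n2 = n + n        # n * 2
    n4 = n2 + n2      # n * 4
    n5 = n4 + n       # n * 5
    return n5 + n5    # n * 10


def atoi_forward(digits: str) -> int:
    """Walk the string front to back: value = value * 10 + digit."""
    value = 0
    for ch in digits:
        value = times_ten(value) + (ord(ch) & 0x0F)
    return value
```

On LC-3 each line of times_ten corresponds to a single ADD, and the loop body needs no table at all, so the digit count is limited only by overflow of the 16-bit word.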
There are other approaches as well if you study atoi.  You could keep the backwards approach, but now you will have to multiply by 10, by 100, by 1000, a different factor for each successive digit.  That might be done by repetitive addition, or by keeping a count of how many times to multiply by 10, e.g. n × 1000 = n × 10 × 10 × 10.

How to map number in a range to another in the same range with no collisions?

Effectively what I'm looking for is a function f(x) that outputs into a range that is pre-defined. Calling f(f(x)) should be valid as well. The function should be cyclical, so calling f(f(...(x))) where the number of calls is equal to the size of the range should give you the original number, and f(x) should not be time dependent and will always give the same output.
While I can see that taking a list of all possible values and shuffling it would give me something close to what I want, I'd much prefer it if I could simply plug values into the function one at a time so that I do not have to compute the entire range all at once.
I've looked into Minimal Perfect Hash Functions but haven't been able to find one that doesn't use external libraries. I'm okay with using them, but would prefer to not do so.
If an actual range is necessary to help answer my question, I don't think it would need to be bigger than [0, 2^24-1], but the starting and ending values don't matter too much.
You might want to take a look at the Linear Congruential Generator (LCG). You should be looking at a full-period generator (say, m = 2^24), which means the parameters must satisfy the Hull-Dobell Theorem.
"Calling f(f(x)) should be valid as well."
Should work.
"the number of calls is equal to the size of the range should give you the original number"
Yes, for an LCG with parameters satisfying the Hull-Dobell Theorem you'll get the full period covered once, and the 'm+1' call shall put you back where you started.
The period of such an LCG is exactly equal to m.
"should not be time dependent and will always give the same output"
An LCG is an O(1) algorithm and it is 100% reproducible.
An LCG is reversible as well, via the extended Euclidean algorithm; check "Reversible pseudo-random sequence generator" for details.
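A minimal Python sketch of such a generator follows. The constants A and C are illustrative values picked only because they meet the Hull-Dobell conditions for a power-of-two modulus (c odd, a ≡ 1 mod 4); they are not a recommendation, and the function names are hypothetical. The inverse uses pow(a, -1, m), which needs Python 3.8 or later.

```python
M = 1 << 24          # size of the range [0, 2^24 - 1]
A = 1140671485       # illustrative multiplier, A % 4 == 1
C = 12820163         # illustrative increment, odd


def is_full_period_pow2(a: int, c: int, m: int) -> bool:
    """Hull-Dobell conditions specialised to m = 2^k with k >= 2."""
    is_pow2 = m >= 4 and (m & (m - 1)) == 0
    return is_pow2 and c % 2 == 1 and a % 4 == 1


def lcg_next(x: int, a: int = A, c: int = C, m: int = M) -> int:
    """f(x) = (a*x + c) mod m."""
    return (a * x + c) % m


def lcg_prev(x: int, a: int = A, c: int = C, m: int = M) -> int:
    """Inverse of lcg_next, using the modular inverse of a."""
    a_inv = pow(a, -1, m)
    return (a_inv * (x - c)) % m
```

Starting from any x and calling lcg_next m times visits every value in the range exactly once and returns to x.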
Minimal perfect hash functions are overkill; all you've asked for is a function f that is
bijective, and
"cyclical" (i.e. f^N(x) = x, where f^N means f applied N times)
For a permutation to be cyclical in that way, its order must divide N (or be N but in a way that's just a special case of dividing N). Which in turn means the LCM of the orders of the sub-cycles must divide N. One way to do that is to just have one "sub"-cycle of order N. For power of two N, it's also really easy to have lots of small cycles of some other power-of-two order. General permutations do not necessarily satisfy the cycle-requirement, of course they are bijective but the LCM of the orders of the sub-cycles may exceed N.
In the following I will leave all reduction modulo N implicit. Without loss of generality I will assume the range starts at 0 and goes up to N-1, where N is the size of the range.
The only thing I can immediately think of for general N is f(x) = x + c where gcd(c, N) == 1. The GCD condition ensures there is only one cycle, which necessarily has order N.
For power-of-two N I have more inspiration:
f(x) = cx where c is odd. Bijective because gcd(c, N) == 1 so c has a modular multiplicative inverse. Also c^N = 1, because φ(N) = N/2 (since N is a power of two) so c^φ(N) = 1 (Euler's theorem).
f(x) = x XOR c where c < N. Trivially bijective and trivially cycles with a period of 2, which divides N.
f(x) = clmul(x, c) where c is odd and clmul is carry-less multiplication. Bijective because any odd c has a carry-less multiplicative inverse. Has some power-of-two cycle length (less than N) so it divides N. I don't know why though. This is a weird one, but it has decent special cases such as x ^ (x << k). By symmetry, the "mirrored" version also works, e.g. x ^ (x >> k).
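f(x) = x >>> k where >>> is bit-rotation over the n = log2(N) bits of the range. Obviously bijective, and f^N(x) = x >>> Nk, which rotates all the way back to the unrotated position whenever Nk mod n = 0. That holds for every k when n is itself a power of two (e.g. N = 2^16), but not necessarily otherwise (e.g. n = 24).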
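The power-of-two candidates above can be checked by brute force on a small range. The following Python sketch assumes N = 2^8 (so the bit width 8 is itself a power of two, which matters for the rotation case); the constants 37, 0x5A and k = 3 are arbitrary illustrative choices.

```python
BITS = 8
N = 1 << BITS
MASK = N - 1


def add_c(x: int, c: int = 37) -> int:        # gcd(c, N) == 1
    return (x + c) & MASK


def mul_c(x: int, c: int = 37) -> int:        # c odd
    return (x * c) & MASK


def xor_c(x: int, c: int = 0x5A) -> int:      # c < N
    return x ^ c


def xorshift_left(x: int, k: int = 3) -> int:   # special case of clmul
    return (x ^ (x << k)) & MASK


def xorshift_right(x: int, k: int = 3) -> int:  # the mirrored version
    return x ^ (x >> k)


def rotate_right(x: int, k: int = 3) -> int:
    return ((x >> k) | (x << (BITS - k))) & MASK


def apply_n_times(f, x: int, times: int) -> int:
    for _ in range(times):
        x = f(x)
    return x


def is_bijective_and_cyclical(f) -> bool:
    """True if f permutes range(N) and applying it N times is the identity."""
    bijective = sorted(f(x) for x in range(N)) == list(range(N))
    cyclical = all(apply_n_times(f, x, N) == x for x in range(N))
    return bijective and cyclical
```

Calling is_bijective_and_cyclical on each function confirms both properties for this N; it is a sanity check on one small range, not a proof for general N.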

Three-way xor-like function

I'm trying to solve the following puzzle:
Given a stream of numbers (only 1 iteration over them is allowed) in which all numbers appear 3 times, but 1 number appears only 2 times, find this number, using O(1) memory.
I started with the idea that, if all numbers appeared 2 times, and 1 number only once, I could use xor operation between all numbers and the result would be the incognito number.
So I want to extend this idea to solve the puzzle. All I need is a xor-like function (or operator), which would yield 0 on the third application:
SEED xor3 X xor3 X xor3 X = SEED
X xor3 Y xor3 SEED xor3 X xor3 Y xor3 Y xor3 X = SEED
Any ideas for such a function?
Regard XOR as summation on each bit of a number expressed in binary (i.e. a radix of 2), modulo 2.
Now consider a numeral system consisting of the ternary digits ("trits") 0, 1, and 2. That is, it has a radix of 3.
The operator T now becomes an operation on any number, decomposed into this radix. As in XOR, you sum the digits position by position, but the difference is that operator T is run modulo 3.
You can easily show that a T a T a is zero for any a. You can also show that T is both commutative and associative. That is necessary since, in general, your sequence will have the numbers jumbled up.
Now apply this to your list of numbers. At the end of the operation, the output will be b where b = o T o and o is the number that occurs exactly twice.
Your solution for the simpler case (all numbers appear twice, one number appears once) works since xor operates on each bit x as
x xor x = 0 and 0 xor x = x
xor is basically a bit-wise summation modulo 2. You would need the base-3 equivalent: transform each number into a base-3 representation, and then use summation modulo 3 for each digit:
    + | 0 1 2
    --+------
    0 | 0 1 2
    1 | 1 2 0
    2 | 2 0 1
Call this operation xor3. Now you have for each digit x:
x xor3 x xor3 x = 0 and 0 xor3 x = x
If you apply that to all your numbers then all values that appear 3 times will vanish. The result is x xor3 x of the number x that appears twice. To recover x, apply digit-wise division by 2 modulo 3 (equivalently, multiply each digit by 2 modulo 3, since 2 × 2 = 4 ≡ 1 mod 3).
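A straightforward (not optimised) Python sketch of this xor3 idea is shown below. It assumes non-negative integers, and the helper names are made up for the example.

```python
def xor3(a: int, b: int) -> int:
    """Add a and b digit by digit in base 3, without carries (sum modulo 3)."""
    result, place = 0, 1
    while a or b:
        digit = (a % 3 + b % 3) % 3
        result += digit * place
        place *= 3
        a //= 3
        b //= 3
    return result


def divide_digits_by_two(value: int) -> int:
    """Digit-wise division by 2 modulo 3, done as multiplication by 2 modulo 3."""
    result, place = 0, 1
    while value:
        result += ((value % 3) * 2 % 3) * place
        place *= 3
        value //= 3
    return result


def find_number_appearing_twice(stream) -> int:
    """Single pass, one accumulator: triples cancel, the pair leaves x xor3 x."""
    accumulator = 0
    for number in stream:
        accumulator = xor3(accumulator, number)
    return divide_digits_by_two(accumulator)
```

For instance, find_number_appearing_twice([5, 4, 7, 9, 5, 7, 4, 9, 9, 7, 5]) returns 4.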
I believe there are more efficient ways to implement that. The advantage of the xor function in the first case relies on the fact that xor is a natural base-2 operation. Is there any practical application for that?
This approach is a bit fragile: if the precondition (all numbers appear 3 times except one that appears twice) breaks, the algorithm will not help you.
Take a Map with int keys and int values. Then walk through your numbers and for each number x increase the corresponding value by one. If x is a new key, take 0 as the start value.
Then you can analyze it easily: Walk through all keys and check the cardinality. It should be three for all keys except one that should be two. This is more robust and my gut feeling says it is also faster.
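The map-based check could be sketched in Python with collections.Counter. The function name and the error handling are illustrative assumptions.

```python
from collections import Counter


def find_pair_with_counts(numbers) -> int:
    """Count every value, then verify the precondition before answering."""
    counts = Counter(numbers)   # one entry per distinct value, so memory grows with the input
    twice = [value for value, n in counts.items() if n == 2]
    invalid = [value for value, n in counts.items() if n not in (2, 3)]
    if len(twice) != 1 or invalid:
        raise ValueError("precondition violated: expected triples plus exactly one pair")
    return twice[0]
```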

How to convert a QuadTree Cell's Spatial Index (Binary Index) to Position and Dimension values?

Sorry in advance for misusing any terminology in this question, but basically I'm looking into creating a QuadTree that makes use of Binary Indexing, like this:
As you can see in the two illustrations above, if each cell is given a binary ID (ex: 1010, 1011) then every ODD binary index controls the X offset and every EVEN binary index controls the Y offset.
For example, in the case of the Level 2 grid (16 cells), 1010 (cell #10) could be said to have 1s at its 4th and 2nd index, therefore those would perform two Y offsets. The first '1###' (on the leftmost side) would indicate an offset of one cell-height, then the second '##1#' would additionally offset it by twice the cell height.
As in:
// If Cell Height = 64pixels
1### = 64 pixels
+ ##1# = 128 pixels
__________________
1#1# = 192 pixels
The same can be applied to the X axis, only it uses the odd numbers instead (ex: #1#1).
Now, when I initialize my QuadTree, I begin by calculating the maximum number of nodes it may contain if all cells and all depths are used. I have calculated this as the sum of 4 to the power of each depth:
_totalNodes = 0;
var t:int=0, tLen:int=_maxLevels;
for (; t<tLen; t++) {
    _totalNodes += Math.pow(4, t); //Adds 1, 4, 16, 64, 256, etc...
}
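As an aside, that sum is a geometric series with the closed form (4^L - 1) / 3 for L levels. A tiny Python sketch (hypothetical function name) shows the equivalence:

```python
def total_nodes(max_levels: int) -> int:
    """1 + 4 + 16 + ... + 4^(max_levels - 1), computed without a loop."""
    return (4 ** max_levels - 1) // 3
```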
Then, I create another loop (iterating from 0 to _totalNodes) which instantiates the nodes and stores them in a long array. It passes the current iteration integer to the Node constructor, which stores it as its index.
So far I've been able to determine which depth (aka: Level) the Node would be stored in by figuring out its index's Most Significant Bit:
public static function MSB( pValue:uint ):int {
    var bits:int = 0;
    while ( pValue >>= 1) {
        bits++;
    }
    return bits;
}
But now, I'm stuck trying to figure out how to convert the index from binary form to actual Cell X and Y positions. Like I said above, the dimensions of each cell are already known. It's just a matter of doing some logical operations on the whole index (or "bit-code", which is the name I use in my code).
If you know of a good example that uses logical-operations (binary level) to convert the binary index values to X and Y positions, could you please post a link or explanation here?
Thanks!
Here's a reference where I got this idea from (note: different programming language):
L. Spiro Engine - http://lspiroengine.com/?p=530
I'm not familiar with the language used in that article though, so I can't really follow it and convert it easily to ActionScript 3.0.
Your task is described by Hanan Samet.
This works by first building the quadtree, and then assigning to each quad cell the corresponding Morton code (bit-interleaving code).
Once you have the code, you assign it to the objects in the quad. Then you can delete the quadtree. You can then search by converting a coordinate to the corresponding Morton code and doing a binary search on the Morton index. Instead of Morton (also called Z-order) you can also use Hilbert or Gray codes.
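Since the question asks for the bit-level conversion, here is a Python sketch of Morton decoding under one common convention: bit 0 (the rightmost) is the lowest X bit, bit 1 the lowest Y bit, and so on in alternation. The function names and the root_size parameter are assumptions for illustration; the same shifts and masks carry over to ActionScript.

```python
def morton_decode(code: int, levels: int) -> tuple:
    """Split an interleaved cell code into (x, y) cell coordinates."""
    x = y = 0
    for i in range(levels):
        x |= ((code >> (2 * i)) & 1) << i        # bits 0, 2, 4, ... form X
        y |= ((code >> (2 * i + 1)) & 1) << i    # bits 1, 3, 5, ... form Y
    return x, y


def morton_encode(x: int, y: int, levels: int) -> int:
    """Inverse of morton_decode: interleave the bits of x and y."""
    code = 0
    for i in range(levels):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def cell_rect(code: int, level: int, root_size: int) -> tuple:
    """Pixel position and side length of a cell at the given level."""
    cell_size = root_size >> level               # root_size / 2^level
    x, y = morton_decode(code, level)
    return x * cell_size, y * cell_size, cell_size
```

With a 256-pixel root and level 2 (cell size 64), cell_rect(0b1010, 2, 256) gives (0, 192, 64), which agrees with the 192-pixel Y offset worked out in the question.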

get random index numbers from a matrix, fortran 90

I am looking for a function or a way to get the index numbers of a 2D matrix:
my example is, I have A(Ly,Lx) where Ly = 100 and Lx = 100
I want to get a random index number of the matrix, such as : Random_node(A) = (random y, random x)
Then I want to do this repeatedly, with the constraint that my random points must not be repeated, or even close to each other, within a threshold radius of (let's say) 10 nodes. The matrix is an Eulerian 2D matrix (y,x).
Is at least the first question straightforward?
Thank you all!
Albert P
Here's one way of getting a random set of locations in your 100x100 matrix. First, declare a 100x100 matrix of reals:
real, dimension(100,100) :: randarray
then, put a random number into each element of that array
call random_number(randarray)
Now, an expression such as
randarray > 0.9
returns a logical array containing, approximately, 10% true values and 90% false. By tracking down the locations of the true values you have the random x-es and y-es that you seek. Indeed you may not need to find those locations at all, you can simply use the expression in masked assignments and similar operations, for example
where(randarray>0.9) a = func()
as long, of course, as func returns a scalar or a 100x100 array.
This approach guarantees that each location is different from all the others.
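For readers who want to see the masking idea outside Fortran, here is a rough Python equivalent using only the standard library. The function name and the seed parameter are illustrative, and indices are 0-based here whereas Fortran arrays are 1-based by default.

```python
import random


def random_mask_locations(ly=100, lx=100, threshold=0.9, seed=None):
    """Draw one uniform number per cell and keep the cells above the threshold."""
    rng = random.Random(seed)
    return [(y, x)
            for y in range(ly)
            for x in range(lx)
            if rng.random() > threshold]
```

Each cell is tested once, so no location can appear twice, and roughly 10% of the cells are selected when the threshold is 0.9.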
It does not, however, address your constraint that the 'random' locations should not be too close to each other. That constraint, of course, is a little inconsistent with randomness.
You could, I suppose, break your 100x100 array into 10x10 blocks and choose, randomly, one element in each block. Would that be a good compromise between your constraints?
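The block compromise might be sketched like this in Python (standard library only; the function name and parameters are illustrative, indices are 0-based). Note that points in neighbouring blocks can still end up closer than 10 nodes, so this spreads the points out without strictly enforcing the radius.

```python
import random


def one_point_per_block(ly=100, lx=100, block=10, seed=None):
    """Pick one random (y, x) inside every block-by-block tile of the grid."""
    rng = random.Random(seed)
    points = []
    for block_y in range(0, ly, block):
        for block_x in range(0, lx, block):
            # min() handles a partial tile at the edge when block does not divide the size
            y = block_y + rng.randrange(min(block, ly - block_y))
            x = block_x + rng.randrange(min(block, lx - block_x))
            points.append((y, x))
    return points
```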