I am looking for a function or a way to get the index numbers of a 2D matrix:
my example is, I have A(Ly,Lx) where Ly = 100 and Lx = 100
I want to get a random index number of the matrix, such as : Random_node(A) = (random y, random x)
Then I want to do this repeatedly having the constraint that I don't want my random points to be repeated or even not to be close one to each other following a threshold of (let's say) 10 nodes of radius. The matrix is an eulerian 2D matrix (y,x).
Is at least the first question straightforward?
Thank you all!
Albert P
Here's one way of getting a random set of locations in your 100x100 matrix. First, declare a 100x100 matrix of reals:
real, dimension(100,100) :: randarray
then, put a random number into each element of that array
call random_number(randarray)
Now, an expression such as
randarray > 0.9
returns a logical array containing, approximately, 10% true values and 90% false. By tracking down the locations of the true values you have the random x-es and y-es that you seek. Indeed you may not need to find those locations at all, you can simply use the expression in masked assignments and similar operations, for example
where(randarray>0.9) a = func()
as long, of course, as func returns a scalar or a 100x100 array.
This approach guarantees that each location is different from all the others.
It does not however, address your constraint that the 'random' locations should not be too close to each other. That constraint, of course, is a little inconsistent with randomness.
You could, I suppose, break your 100x100 array into 10x10 blocks and choose, randomly, one element in each block. Would that be a good compromise between your constraints ?
Related
All examples that I can find on the Internet just visualize the result array of the function computeSpectrum, but I am tasked with something else.
I generate a music note and I need by analyzing the result array to be able to say what note is playing. I figured out that I need to set the second parameter of the function call 'FFTMode' to true and then it returns sound frequencies. I thought that really it should return only one non-zero value which I could use to determine what note I generated using Math.sin function, but it is not the case.
Can somebody suggest a way how I can accomplish the task? Using the soundMixer.computeSpectrum is a requirement because I am going to analyze more complex sounds later.
FFT will transform your signal window into set of Nyquist sine waves so unless 440Hz is one of them you will obtain more than just one nonzero value! For a single sine wave you would obtain 2 frequencies due to aliasing. Here an example:
As you can see for exact Nyquist frequency the FFT response is single peak but for nearby frequencies there are more peaks.
Due to shape of the signal you can obtain continuous spectrum with peaks instead of discrete values.
Frequency of i-th sample is f(i)=i*samplerate/N where i={0,1,2,3,4,...(N/2)-1} is sample index (first one is DC offset so not frequency for 0) and N is the count of samples passed to FFT.
So in case you want to detect some harmonics (multiples of single fundamental frequency) then set the samplerate and N so samplerate/N is that fundamental frequency or divider of it. That way you would obtain just one peak for harmonics sinwaves. Easing up the computations.
I am to find the smallest distance between a given set of points and the origin. I have a matrix with 2 columns and 10 rows. Each row represents coordinates. One point consists of two coordinates and I would like to calculate the smallest distance between each point and to the origin. I would also like to determine which point gave this smallest distance.
In Octave, I calculate this distance by using norm and for each point in my set, I have a distance associated with them and the smallest distance is obviously the one I'm looking for. However, the code I wrote below isn't working the way it should.
function [dist,koor] = bonus4(S)
S= [-6.8667, -44.7967;
-38.0136, -35.5284;
14.4552, -27.1413;
8.4996, 31.7294;
-17.2183, 28.4815;
-37.5100, 14.1941;
-4.2664, -24.4428;
-18.6655, 26.9427;
-15.8828, 18.0170;
17.8440, -22.9164];
for i=1:size(S)
L=norm(S(i, :))
dist=norm(S(9, :));
koor=S(9, :) ;
end
i = 9 is the correct answer, but I need Octave to put that number in. How do I tell Octave that this is the number I want? Specifically:
dist=norm(S(9, :));
koor=S(9, :);
I cannot use any packages. I found the geometry package online but I am to solve the task without additional packages.
I'll work off of your original code. Firstly, you want to compute the norm of all of the points and store them as individual elements in an array. Your current code isn't doing that and is overwriting the variable L which is a single value at each iteration of the loop.
You'll want to make L an array and store the norms at each iteration of the loop. Once you do this, you'll want to find the location as well as the minimum distance itself. That can be done with one call to min where the first output gives you the minimum distance and the second output gives you the location of the minimum. You can use the second output to slice into your S array to retrieve the actual point.
Last but not least, you need to define S first before calling this function. You are defining S inside the function and that will probably give you unintended results if you want to change the input into this function at each invocation. Therefore, define S first, then call the function:
S= [-6.8667, -44.7967;
-38.0136, -35.5284;
14.4552, -27.1413;
8.4996, 31.7294;
-17.2183, 28.4815;
-37.5100, 14.1941;
-4.2664, -24.4428;
-18.6655, 26.9427;
-15.8828, 18.0170;
17.8440, -22.9164];
function [dist,koor] = bonus4(S)
%// New - Create an array to store the distances
L = zeros(size(S,1), 1);
%// Change to iterate over number of rows
for i=1:size(S,1)
L(i)=norm(S(i, :)); %// Change
end
[dist,ind] = min(L); %// Find the minimum distance
koor = S(ind,:); %// Get the actual point
end
Or, make sure you save the above function in a file called bonus4.m, then do this in the Octave command prompt:
octave:1> S= [-6.8667, -44.7967;
> -38.0136, -35.5284;
> 14.4552, -27.1413;
> 8.4996, 31.7294;
> -17.2183, 28.4815;
> -37.5100, 14.1941;
> -4.2664, -24.4428;
> -18.6655, 26.9427;
> -15.8828, 18.0170;
> 17.8440, -22.9164];
octave:2> [dist,koor] = bonus4(S);
Though this code works, I'll debate that it's slow as you're using a for loop. A faster way would be to do this completely vectorized. Because using norm for matrices is different than with vectors, you'll have to compute the distance yourself. Because you are measuring the distance from the origin, you can simply square each of the columns individually then add the columns of each row.
Therefore, you can just do this:
S= [-6.8667, -44.7967;
-38.0136, -35.5284;
14.4552, -27.1413;
8.4996, 31.7294;
-17.2183, 28.4815;
-37.5100, 14.1941;
-4.2664, -24.4428;
-18.6655, 26.9427;
-15.8828, 18.0170;
17.8440, -22.9164];
function [dist,koor] = bonus4(S)
%// New - Computes the norm of each point
L = sqrt(sum(S.^2, 2));
[dist,ind] = min(L); %// Find the minimum distance
koor = S(ind,:); %// Get the actual point
end
The function sum can be used to sum over a dimension independently. As such, by doing S.^2, you are squaring each term in the points matrix, then by using sum with the second parameter as 2, you are summing over all of the columns for each row. Taking the square root of this result computes the distance of each point to the origin, exactly the way the for loop functions. However, this (at least to me) is more readable and I daresay faster for larger sizes of points.
Sorry in advance for miss-using any terminology in this question, but basically I'm looking into creating a QuadTree that makes use of Binary Indexing, like this:
As you can see in the two illustrations above, if each cells are given a binary ID (ex: 1010, 1011) then every ODD binary indices controls the X offset and every EVEN binary indices controls the Y offset.
For example, in the case of the Level 2 grid (16 cells), 1010 (cell #10) could be said to have 1s at it's 4th and 2nd index, therefore those would perform two Y offsets. The first '1###' (on the leftmost side) would indicate an offset of one cell-height, then the second '##1#' would additionally offset it twice the cell height.
As in:
// If Cell Height = 64pixels
1### = 64 pixels
+ ##1# = 128 pixels
__________________
1#1# = 192 pixels
The same can be applied to the X axis, only it uses the odd numbers instead (ex: #1#1).
Now, when I initialize my QuadTree, I began calculating the maximum nodes it may contain if all cells and all depths are used. I have calculated this with the sum of 4 to the power of each depths:
_totalNodes = 0;
var t:int=0, tLen:int=_maxLevels;
for (; t<tLen; t++) {
_totalNodes += Math.pow(4, t); //Adds 1, 4, 16, 64, 256, etc...
}
Then, I create another loop (iterating from 0 to _totalNodes) which instantiates the nodes and stores it in a long array. It passes the current iteration integer to the Node constructor, and it stores it as it's index.
So far I've been able to determine which depth (aka: Level) the Node would be stored in by figuring out it's index's Most Significant Bit:
public static function MSB( pValue:uint ):int {
var bits:int = 0;
while ( pValue >>= 1) {
bits++;
}
return bits;
}
But now, I'm stuck trying to figure out how to convert the index from binary form to actual Cell X and Y positions. like I said above, the dimensions of each cells are found. It's just a matter of doing some logical operations on the whole index (or "bit-code" is the name I refer to in my code)
If you know of a good example that uses logical-operations (binary level) to convert the binary index values to X and Y positions, could you please post a link or explanation here?
Thanks!
Here's a reference where I got this idea from (note: different programming language):
L. Spiro Engine - http://lspiroengine.com/?p=530
I'm not familiar with the language used in that article though, so I can't really follow it and convert it easily to ActionScript 3.0.
your task is described by Hannan Samet.
This works by first building the quadtree, and then assign to each quad cell the coresponding morton code. (bit interleaving code).
once you have the code, you assign it to the objects in the quad. then you can delte the quad tree. you then can search by converting a coordinate to the coresponding morton code, and do a bin search on the morton index. Instead of morton (also called z order) you als can use hilbert or gray codes.
The problem is as follows,
I would be given a set of x and y coordinates(an coordinate array of around 30 to 40 thousand) of a long rope. The rope is lying on the ground and can be in any shape.
Now I would be given a start point(essentially x and y coordinate) and an ending point.
What is the efficient way to determine the set of x and y coordinates from the above mentioned coordinate array lie between the start and end points.
Exhaustive searching ie looping 40k times is not an acceptable solution (mentioned on the question paper)
A little bit margin for error is acceptable
We need to find the start point in the array, then the end point. For each, we can think of the rope as describing a function of distance from that point, and we're looking for the lowest point on that distance graph. If one point is a long way away and another is pretty close, we can do some kind of interpolation guess of where to search next.
distance
| /---\
|-- \ /\ -
| -- ------- -- ------ ---------- -
| \ / \---/ \--/
+-----------------------X--------------------------- array index
In the representation above, we want to find "X"... we look at the distances at a few points, get an impression of the slope of the distance curve, possibly even the rate of change of that slope, to help guide our next bit of probing....
To refine the basic approach of doing binary- or interpolated- searches in areas where we know the distance values are low, we may be able to use the following:
if we happen to be given the rope length and know the coordinate samples are equidistant along the rope, then we can calculate a maximum change in distance from our target point per sample.
if we know the rope has a stiffness ensuring it can't loop in a trivially small diameter, then
there's a known limit to how fast the slope of the curve can change
distance curve converges to vertical on both sides of the 0 point
you could potentially cross-reference/combine distance with, or use instead, the direction of each point from the target: only at the target would the direction instantly change ~180 degrees (how well the data points capture this still depends on the distance between adjacent samples and any stiffness of the rope).
Otherwise, there's always risk the target point may weirdly be encased by two very distance points, frustrating our whole searching algorithm (that must be what they mean about some margin for error - every now and then this search would have to revert to a O(N) brute-force search because any trend analysis fails).
For a one-time search, sometimes linear traversal is the simplest, fastest solution. Maybe that's the case for this problem.
Iterate through the ordered list of points until finding the start or end, and then collect points until hitting the other endpoint.
Now, if we expected to repeat the search, we could build an index to the points.
Edit: This presumes no additional constraints beyond those mentioned by #koool. Constraining the distance between the points would allow the hill-climbing approach described in #Tony's answer.
I don't think you can solve it accurately using anything other than exhaustive search. Say for cases where the rope is folded into half and the resulting double rope forms a spiral with the two ends on the centre.
However if we assume that long portions of the rope are in straight line, then we can eliminate a lot of points based on the slope check:
if (abs(slope(x[i],y[i],x[i+1],y[i+1])
-slope(x[i+1],y[i+1],x[i+2],y[i+2]))<tolerance)
eliminate (x[i+1],y[i+1]);
This will reduce the search time significantly if large portions of the rope are in straight line. But will be linear WRT number of remaining points.
So basically, you've got a sorted list of the points that comprise the entire rope and you're given two arbitrary points from within that list, and tasked with returning the sublist that exists between those two points.
I'm going to make the assumption that the start and end points that are provided are guaranteed to coincide exactly with points within the sorted list (otherwise it introduces a host of issues, particularly if the rope may be arbitrarily thin and passes by the start/end points multiple times).
That means all you're really looking for are the indices of the two provided coordinates. Or the index of one, and the answer to "is the second coordinate to the right or to the left?".
A simple O(n) solution to that would be:
For each index in array
coord = array[index]
if (coord == point1)
startIndex = index
if (coord == point2)
endIndex = index
if (endIndex < startIndex)
swap(startIndex, endIndex)
return array.sublist(startIndex, endIndex)
Or, if you wanted to optimize for repeated queries, I'd suggest a hashing based approach where you map each cooordinate to its index in the array. Something like:
//build the map (do this once, at init)
map = {}
For each index in array
coord = array[index]
map[coord] = index
//find a sublist (do this for each set of start/end points)
startIndex = map[point1]
endIndex = map[point2]
if (endIndex < startIndex)
swap(startIndex, endIndex)
return array.sublist(startIndex, endIndex)
That's O(n) to build the map, but once it's built you can determine the sublist between any two points in O(1). Assuming an efficient hashmap, of course.
Note that if my assumption doesn't hold, then the same solutions are still usable, provided that as a first step you take the provided start and end points and locate the points in the array that best correspond to each one. As noted, unless you are given some constraints regarding the thickness of the rope then interpolating from an arbitrary coordinate to one that's actually part of the rope can only be guesswork at best.
I am faced with a problem where I have to calculate intersections between all pairs in a collection of sets. None of the sets are smaller than a small constant k, and I'm only interested in whether two sets have an intersection larger than k-1 elements or not. I do not need the actual intersections nor the exact size, only whether it's larger than k-1 or not. Is there some clever pre-processing trick or a neat set intersection algorithm that I could use to speed things up?
More info that can be useful to answer the question:
The sets represent maximal cliques in a large, undirected, sparse graph. The number of sets can be in the order of tens of thousands or more, but most of the sets are likely to be small.
The sets are already sorted members of each set are in increasing order. Effectively they are sorted lists - I receive them this way from an underlying library for maximal clique search.
Nothing is known about the distribution of elements in the sets (i.e. whether they are in tight clumps or not).
Most of the set intersections are likely to be empty, so the ideal solution would be a clever data structure that helps me cut down the number of set intersections I have to make.
Consider a mapping with all sets of size k as the keys and corresponding values of lists of all sets from your collection that contain the key as a subset. Given this mapping, you don't need to perform any intersection tests: for each key, all pairs of sets from the list will have an intersection of size at least k. This approach can produce the same pair of sets more than once, so that will need to be checked.
The mapping is easy enough to calculate. For each set in the collection, calculate all the size-k subsets and append the original set to the list for that key set. But is this actually faster? In general, no. The performance of this approach will depend on the distribution of the sizes of the sets in the collection and the value of k. With d distinct elements in the sets, you could have as many as d choose k keys, which can be very large.
However, the basic idea is usable to reduce the number of intersections. Instead of using sets of size k, use smaller ones of fixed size q as the keys. The values are again lists of all sets that have the key as a subset. Now, test each pair of sets from the list for intersection. Thus, with q=1 you only test those pairs of sets that have at least one element in common, with q=2 you only test those pairs of sets that have at least two elements in common, and so on. The optimal value for q will depend on the distribution of sizes of the sets, I think.
For the sets in question, a good choice might be q=2. The keys are then just the edges of the graph, giving a predictable size to the mapping. Since most sets are expected to be disjoint, q=2 should eliminate a lot of comparisons without much additional overhead.
One possible optimization, which is more effective the smaller the range of values contained in each set:
Create a list of all the sets, sorted by their kth-greatest element (this is easy to find, since you already have each set with its elements in order). Call this list L.
For any two sets A and B, their intersection cannot have as many as k elements in it if the kth-greatest element in A is less than the least element in B.
So, for each set in turn, calculate its intersection only with the sets in the relevant part of L.
You can use the same fact to exit early from computing the intersection of any two sets - if there are only n-1 elements left to compare in one of the sets, and the intersection so far contains at most k-n elements, then stop. The above procedure is simply this rule applied to all the sets in L at once, with n=k, at the point where we're looking at the least element of set B and the kth-greatest element of A.
The following strategy should be quite efficient. I've used variations of this for intersecting ascending sequences on a number of occasions.
First I assume that you have some sort of priority queue available (if not, rolling your own heap is pretty easy). And a fast key/value lookup (btree, hash, whatever).
With that said, here is pseudocode for an algorithm that should do what you want quite efficiently.
# Initial setup
sets = array of all sets
intersection_count = key/value lookup with keys = (set_pos, set_pos) and values are counts.
p_queue = priority queue whose elements are (set[0], 0, set_pos), organized by set[0]
# helper function
def process_intersections(current_sets):
for all pairs of current_sets:
if pair in intersection_count:
intersection_count[pair] += 1
else:
intersection_count[pair] = 1
# Find all intersections
current_sets = []
last_element = first element of first thing in p_queue
while p_queue is not empty:
(element, ind, set_pos) = get top element from p_queue
if element != last_element:
process_intersections(current_sets)
last_element = element
current_sets = []
current_sets.append(set_pos)
ind += 1
if ind < len(sets[set_pos]):
add (sets[set_pos][ind], ind, set_pos) to p_queue
# Don't forget the last one!
process_intersections(current_sets)
final answer = []
for (pair, count) in intersection_count.iteritems():
if k-1 < count:
final_answer.append(pair)
The running time will be O(sum(sizes of sets) * log(number of sets) + count(times a point is in a pair of sets). In particular note that if two sets have no intersection, you never try to intersect them.
What if you used a predictive subset as a prequalifier. Pre-sort, but use a subset intersection as a threshold condition. If subset intersection > n% then complete the intersection, otherwise abandon. n then becomes the inverse of your comfort level with the prospect of a false positive.
You could also sort by the subset intersections(m) calculated earlier and begin running the full intersection ordered by m descending. So presumably the majority of your highest m intersections would likely cross your k threshold on the full subset and the probably of hitting your k threshold would continually decrease.
This really starts to treat the problem as NP-Complete.