I have a series of values that are each being stored as UInt16. Each of these numbers represents a bitmask - these numbers are commands that have been sent to a microprocessor telling it which pins to set high or low. I would like to parse this arrow of commands to find out which pins were being set high each time in such a way that is easier to analyse later.
Consider the example value 0x3c00, which in decimal is 15360 and in binary is 0011110000000000. Currently I have the following function
function read_message(hex_rep)
return findall.(x -> x .== '1',bitstring(hex_rep))
end
Which gets called on every element of the array of UInt16. Is there a better/more efficient way of doing this?
The best approach probably depends on how you want to handle vectors of hex-values. But here's an approach for processing a single hex which is much faster than the one in the OP:
function readmsg(x::UInt16)
N = count_ones(x)
inds = Vector{Int}(undef, N)
if N == 0
return inds
end
k = trailing_zeros(x)
x >>= k + 1
i = N - 1
inds[N] = n = 16 - k
while i >= 1
(x, r) = divrem(x, 0x2)
n -= 1
if r == 1
inds[i] = n
i -= 1
end
end
return inds
end
I can suggest padding your vector into a Vector{UInt64} and use that to manually construct a BitVector. The following should mostly work (even for input element types other than UInt16), but I haven't taken into account specific endianness you might want to respect:
julia> function read_messages(msgs)
bytes = reinterpret(UInt8, msgs)
N = length(bytes)
nchunks, remaining = divrem(N, sizeof(UInt64))
padded_bytes = zeros(UInt8, sizeof(UInt64) * cld(N, sizeof(UInt64)))
copyto!(padded_bytes, bytes)
b = BitVector(undef, N * 8)
b.chunks = reinterpret(UInt64, padded_bytes)
return b
end
read_messages (generic function with 1 method)
julia> msgs
2-element Vector{UInt16}:
0x3c00
0x8000
julia> read_messages(msgs)
32-element BitVector:
0
0
0
0
0
0
0
0
0
⋮
0
0
0
0
0
0
0
1
julia> read_messages(msgs) |> findall
5-element Vector{Int64}:
11
12
13
14
32
julia> bitstring.(msgs)
2-element Vector{String}:
"0011110000000000"
"1000000000000000"
(Getting rid of the unnecessary allocation of the undef bit vector would require some black magic, I belive.)
Let's say I've got a matrix with n columns, and I've got n different functions.
Is it possible to apply i-th function per each element in i-th column efficiently, that is without using loop?
For example for the following variables:
funs = #(x) [x, cos(x), x.^2]
A = [1 0 1
2 0 2
3 0 3
4 0 4] ;
I would like to obtain the following result:
B = [1 1 1
2 1 4
3 1 9
4 1 16] ;
without looping through columns...
I know how to find the number of triangles in an adjacency matrix.
tri = trace(A^3) / 6
But i require to find the nodes so that i can finally find the value of the edges from adjacency matrix since it's a sign graph. Is there already existing function which does that?
Taking the power of the adjacency matrix loses information about the intermediate nodes. Instead of a 2-dimensional matrix, we need 3 dimensions.
Given a graph:
and its adjacency matrix:
A =
0 0 0 0 1 1 0 1 0 0
0 0 0 1 0 1 0 0 0 0
0 0 0 1 0 0 0 1 0 1
0 1 1 0 1 0 1 0 0 0
1 0 0 1 0 0 1 0 0 0
1 1 0 0 0 0 0 1 1 0
0 0 0 1 1 0 0 0 1 0
1 0 1 0 0 1 0 0 0 0
0 0 0 0 0 1 1 0 0 0
0 0 1 0 0 0 0 0 0 0
Compute the 3d matrix T such that T(i,j,k) == 1 iff there is a path in the graph i=>j=>k=>i.
T = and(A, permute(A, [3 1 2]))
This is the equivalent of squaring the adjacency matrix, but keeping the path information. and is used here instead of multiplication in case A is a weighted adjacency matrix. If you sum along the 2nd dimension, you'll get A^2:
>> isequal(squeeze(sum(T,2)), A^2)
ans = 1
Now that we've got the paths of length 2, we just need to filter so we keep only the paths that return to their starting points.
T = and(T, permute(A.', [1 3 2])); % Transpose A in case graph is directed
Now, if T(i,j,k) == 1, then there is a triangle starting at node i, through nodes j and k and returning to node i. If you want to find all such paths:
[M,N,P] = ind2sub(size(T), find(T));
P = [M,N,P];
P will be a list of all triangular paths:
P =
8 6 1
6 8 1
7 5 4
5 7 4
7 4 5
4 7 5
8 1 6
1 8 6
5 4 7
4 5 7
6 1 8
1 6 8
In this case we get 12 paths. All paths in an undirected graph have 6 duplicates: one starting at each triangle point, times 2 directions. This gives the same results as trace:
>> trace(A^3)
ans = 12
If you want to remove the duplicates, the simplest way for triangles is to simply sort the vertex ordering and then take the unique rows of the list. This works for triangles only because all permutations of the nodes in the cycle are present. For longer cycles, this will not work.
P = unique(sort(P, 2), 'rows');
P =
1 6 8
4 5 7
Here is a solution using matrix multiplication:
C = (A * A.') & A;
[x, y] = find(tril(C));
n = numel(x);
D = sparse([x; y], [1:n 1:n].', 1, size(A,1), n);
[X, ~, V] = find(C * D);
tri = [x y X(V == 2)]
tri = unique(sort(tri, 2), 'rows');
First we need to know what are triangle nodes. Two nodes are triangle nodes if they have a common neighbor and both of them are neighbor of each other.
We take the definition to compute an adjacency matrix C that only contains triangle nodes and all other node are removed.
The expression A * A.' selects nodes that have common neighbors and the & A operator says that those nodes that have common neighbors should by neighbor of each other.
Now we can use [x, y] = find(tril(C)); to extract the first and the second points of each triangle as x and y respectively.
For the third node we need to find a node that has x and y as its neighbors. As before we can use the multiplication of boolean matrix trick to speed up the computation.
Finally the result tri has duplicates that should be remove using unique and sort.
let us consider following code for impulse function
function y=impulse_function(n);
y=0;
if n==0
y=1;
end
end
this code
>> n=-2:2;
>> i=1:length(n);
>> f(i)=impulse_function(n(i));
>>
returns result
f
f =
0 0 0 0 0
while this code
>> n=-2:2;
>> for i=1:length(n);
f(i)=impulse_function(n(i));
end
>> f
f =
0 0 1 0 0
in both case i is 1 2 3 4 5,what is different?
Your function is not defined to handle vector input.
Modify your impluse function as follows:
function y=impulse_function(n)
[a b]=size(n);
y=zeros(a,b);
y(n==0)=1;
end
In your definition of impulse_function, whole array is compared to zero and return value is only a single number instead of a vector.
In the first case you are comparing an array to the value 0. This will give the result [0 0 1 0 0], which is not a simple true or false. So the statement y = 0; will not get executed and f will be [0 0 0 0 0] as shown.
In the second you are iterating through the array value by value and passing it to the function. Since the array contains the value 0, then you will get 1 back from the function in the print out of f (or [0 0 1 0 0], which is an impulse).
You'll need to modify your function to take array inputs.
Perhaps this example will clarify the issue further:
cond = 0;
if cond == 0
disp(cond) % This will print 0 since 0 == 0
end
cond = 1;
if cond == 0
disp(cond) % This won't print since since 1 ~= 0 (not equal)
end
cond = [-2 -1 0 1 2];
if cond == 0
disp(cond) % This won't print since since [-2 -1 0 1 2] ~= 0 (not equal)
end
You could define your impulse function simply as this one -
impulse_function = #(n) (1:numel(n)).*n==0
Sample run -
>> n = -6:4
n =
-6 -5 -4 -3 -2 -1 0 1 2 3 4
>> out = impulse_function(n)
out =
0 0 0 0 0 0 1 0 0 0 0
Plot code -
plot(n,out,'o') %// data points
hold on
line([0 0],[1 0]) %// impulse point
Plot result -
You can write an even simpler function:
function y=impulse_function(n);
y = n==0;
Note that this will return y as a type logical array but that should not affect later numerical computations.
I have an array like this:
0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0
I want every non-zero elements to expand themselves one element at a time until it reaches other non-zero elements, the result is like this:
1 1 1 1 1 1 5 5 5 5 3 3 3 3 8 8 8 8
Is there any way to do this using thrust?
Is there any way to do this using thrust?
Yes, here is one possible approach.
For each position in the sequence, compute 2 distances. The first is the distance to the nearest non-zero value in the left direction, and the second is the distance to the nearest non-zero value in the right direction. If the position itself is non-zero, both left and right distances will be computed as zero. Our basic engine for this will be segmented inclusive scans, one computed in the left to right direction (to compute the distance from the left for each zero segment), and the other computed in the reverse direction (to compute the distance from the right for each zero segment). Using your example:
a vector: 0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0
a left dist: ? ? ? 0 1 2 3 4 0 1 2 0 1 2 3 0 1 2
a right dist:3 2 1 0 4 3 2 1 0 2 1 0 3 2 1 0 ? ?
Note that in each distance computation, we must special-case one end if that end does not happen to begin with a non-zero value (because the distance from that direction is "undefined"). We will special case those ? distances by assigning them large values, the reason for which will become evident in the next step.
We now will create a "map" vector, which, for each output position, allows us to select an element from the original input vector that belongs in that output position. This map vector is computed by taking the lesser of the two computed distances, and adjusting the index either from the left or the right, by that distance:
output index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
a left dist: ? ? ? 0 1 2 3 4 0 1 2 0 1 2 3 0 1 2
a right dist: 3 2 1 0 4 3 2 1 0 2 1 0 3 2 1 0 ? ?
map vector: 3 3 3 3 3 3 8 8 8 8 11 11 11 11 15 15 15 15
For the map vector computation, if a left dist > a right dist then we take the output index and add a right dist to it, to produce the map vector element at that position. Otherwise, we take the output index and subtract a left dist from it. Note that the special-case ? entries above should be considered to be "arbitrarily large" for this computation. This is simulated in the code by using a large integer (1<<30).
Once we have the map vector, it's a trivial matter to use it to do a mapped copy from input to output vectors:
a vector: 0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0
map vector: 3 3 3 3 3 3 8 8 8 8 11 11 11 11 15 15 15 15
out vector: 1 1 1 1 1 1 5 5 5 5 3 3 3 3 8 8 8 8
Here is a fully worked example:
$ cat t610.cu
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <thrust/scan.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/functional.h>
#include <thrust/transform.h>
#include <thrust/sequence.h>
#include <iostream>
#define IVAL (1<<30)
// used to create input vector for prefix sums (distance vector computation)
struct is_zero {
template <typename T>
__host__ __device__
T operator() (T val) {
return (val) ? 0:1;
}
};
// inc and dec help with special casing of left and right ends
struct inc {
template <typename T>
__host__ __device__
T operator() (T val) {
return val+IVAL;
}
};
struct dec {
template <typename T>
__host__ __device__
T operator() (T val) {
return val-IVAL;
}
};
// this functor is lifted from thrust example code
// and is used to enable segmented scans based on flag delimitors
// BinaryPredicate for the head flag segment representation
// equivalent to thrust::not2(thrust::project2nd<int,int>()));
template <typename HeadFlagType>
struct head_flag_predicate : public thrust::binary_function<HeadFlagType,HeadFlagType,bool>
{
__host__ __device__
bool operator()(HeadFlagType left, HeadFlagType right) const
{
return !right;
}
};
// distance tuple ordering is left (0), then right (1)
struct map_functor
{
template <typename T>
__host__ __device__
int operator() (T dist){
int leftdist = thrust::get<0>(dist);
int rightdist = thrust::get<1>(dist);
int idx = thrust::get<2>(dist);
return (leftdist > rightdist) ? (idx+rightdist):(idx-leftdist);
}
};
int main(){
int h_a[] = { 0, 0, 0, 1, 0, 0, 0, 0, 5, 0, 0, 3, 0, 0, 0, 8, 0, 0 };
int n = sizeof(h_a)/sizeof(h_a[0]);
thrust::device_vector<int> a(h_a, h_a+n);
thrust::device_vector<int> az(n);
thrust::device_vector<int> asl(n);
thrust::device_vector<int> asr(n);
thrust::transform(a.begin(), a.end(), az.begin(), is_zero());
// set up distance from the left vector (asl)
thrust::transform_if(az.begin(), az.begin()+1, a.begin(), az.begin(),inc(), is_zero());
thrust::transform(a.begin(), a.begin()+1, a.begin(), inc());
thrust::inclusive_scan_by_key(a.begin(), a.end(), az.begin(), asl.begin(), head_flag_predicate<int>());
thrust::transform(a.begin(), a.begin()+1, a.begin(), dec());
thrust::transform_if(az.begin(), az.begin()+1, a.begin(), az.begin(), dec(), is_zero());
// set up distance from the right vector (asr)
thrust::device_vector<int> ra(n);
thrust::sequence(ra.begin(), ra.end(), n-1, -1);
thrust::transform_if(az.end()-1, az.end(), a.end()-1, az.end()-1, inc(), is_zero());
thrust::transform(a.end()-1, a.end(), a.end()-1, inc());
thrust::inclusive_scan_by_key(thrust::make_permutation_iterator(a.begin(), ra.begin()), thrust::make_permutation_iterator(a.begin(), ra.end()), thrust::make_permutation_iterator(az.begin(), ra.begin()), thrust::make_permutation_iterator(asr.begin(), ra.begin()), head_flag_predicate<int>());
thrust::transform(a.end()-1, a.end(), a.end()-1, dec());
// create combined map vector
thrust::device_vector<int> map(n);
thrust::counting_iterator<int> idxbegin(0);
thrust::transform(thrust::make_zip_iterator(thrust::make_tuple(asl.begin(), asr.begin(), idxbegin)), thrust::make_zip_iterator(thrust::make_tuple(asl.end(), asr.end(), idxbegin+n)), map.begin(), map_functor());
// use map to create output
thrust::device_vector<int> result(n);
thrust::copy(thrust::make_permutation_iterator(a.begin(), map.begin()), thrust::make_permutation_iterator(a.begin(), map.end()), result.begin());
// display results
std::cout << "Input vector:" << std::endl;
thrust::copy(a.begin(), a.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
std::cout << "Output vector:" << std::endl;
thrust::copy(result.begin(), result.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
}
$ nvcc -arch=sm_20 -o t610 t610.cu
$ ./t610
Input vector:
0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0
Output vector:
1 1 1 1 1 1 5 5 5 5 3 3 3 3 8 8 8 8
$
Notes:
The above implementation probably has areas that can be improved on, particularly with respect to fusion of operations. However, for understanding purposes, I think fusion makes the code a bit harder to read.
I have really only tested it on the particular example you gave. There may be bugs that you will uncover. My purpose is not to give you a black-box library function that you use but don't understand, but rather to teach you how to write your own code that does what you want.
The "ambiguity" pointed out by JackOLantern is still present in your problem statement. I have obscured it by choosing my map functor behavior to mimic the output you indicated as desired, but simply by creating an equally valid but opposite realization of the map functor (using "if a left dist < a right dist then ..." instead) I can cause the result between 3 and 8 to take the other possible outcome/state. Your comment that "if there is an ambiguity, whoever reaches the position first fill its value to that space" makes no sense to me, unless by that you mean "I don't care which outcome you provide." There is no concept of a particular thread reaching a particular point first. Threads (and blocks) can execute in any order, and this order can change from device to device, and run to run.