I have a weird problem.
I have a very simple constructor that creates a matrix with no values:
RegMatrix::RegMatrix(const int numRow, const int numCol):
_numRow(numRow),_numCol(numCol),_matrix()
{
}
_matrix is a vector that holds 'Comlex', an object I've created
and VAL(i,j) is #define VAL(i,j) ((i * _numCol) + j)
Now, I call this constructor in function transpose:
RegMatrix RegMatrix::transpose()
{
RegMatrix newMatrix(_numCol,_numRow);
cout << "DIMENSIONS " << newMatrix._numRow << " " << newMatrix._numCol << endl;
for(int j=0; j<_numCol; j++)
{
for(int i=0; i<_numRow; i++)
{
newMatrix._matrix[VAL(i,j)] = _matrix[VAL(j,i)]; //<--SEGMENTATION FAULT
}
}
return newMatrix;
}
And here's my problem: I get a segmentation fault the very first time I enter the second loop. In the Eclipse debugger, the _numRow and _numCol values of newMatrix appear to be garbage (one is 0, the other is -10000000 or something like that). Even weirder, I added the output line just to be sure, and it printed the right numbers!
So, any ideas as to what can be my problem?
Thanks!
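You are indexing into an empty vector, which is doomed to fail: the constructor default-constructs _matrix, so its size is 0 and every subscript is out of bounds. Use at() instead of the subscript operator and you will get an exception rather than undefined behaviour.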
My guess (based on what you show) is that there may be some problems with how you implement the copy constructor.
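A minimal sketch of one possible fix, assuming _matrix is a std::vector. The names Matrix, at, transposed, rows and cols are hypothetical, and std::complex<double> stands in for the asker's own class, which is not shown. The constructor sizes the vector up front, and each matrix is indexed with its own column count instead of a macro that always reads the _numCol of the calling object:

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RegMatrix; std::complex<double> replaces
// the asker's custom complex type.
class Matrix {
public:
    Matrix(int numRow, int numCol)
        : _numRow(numRow), _numCol(numCol),
          // Allocate numRow * numCol value-initialized elements up front.
          _matrix(static_cast<std::size_t>(numRow) * numCol) {}

    int rows() const { return _numRow; }
    int cols() const { return _numCol; }

    // Row-major access. std::vector::at throws std::out_of_range when the
    // flat index falls outside the storage, instead of crashing silently.
    std::complex<double>& at(int i, int j) {
        return _matrix.at(static_cast<std::size_t>(i) * _numCol + j);
    }
    const std::complex<double>& at(int i, int j) const {
        return _matrix.at(static_cast<std::size_t>(i) * _numCol + j);
    }

    // Element (i, j) of this matrix becomes element (j, i) of the result.
    Matrix transposed() const {
        Matrix result(_numCol, _numRow);
        for (int i = 0; i < _numRow; ++i) {
            for (int j = 0; j < _numCol; ++j) {
                result.at(j, i) = at(i, j);
            }
        }
        return result;
    }

private:
    int _numRow;
    int _numCol;
    std::vector<std::complex<double>> _matrix;
};
```

Because the vector manages its own storage, the compiler-generated copy constructor is enough here, so returning the result by value is safe.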
My main problem is printing values in binary in C. Python and C# have quick, easy functions for this. I found topics on how to do it in C++ and on how to convert an int to binary in C, but not on how to convert a uint32_t to binary in C.
What I am trying to do is read, bit by bit, the 32 bits at the DR_REG_RNG_BASE address of an ESP32 (this is the address where the random values from the ESP's hardware random number generator are stored).
So far I have been doing this:
#define DR_REG_RNG_BASE 0x3ff75144
void printBitByBit( ){
// READ_PERI_REG is the ESP32 function to read DR_REG_RNG_BASE
uint32_t rndval = READ_PERI_REG(DR_REG_RNG_BASE);
int i;
for (i = 1; i <= 32; i++){
int mask = 1 << i;
int masked_n = rndval & mask;
int thebit = masked_n >> i;
Serial.printf("%i", thebit);
}
Serial.println("\n");
}
At first I thought it was working well, but in fact it produces binary representations that are completely wrong. Any ideas?
Your shown code has a number of errors/issues.
First, bit positions for a uint32_t (32-bit unsigned integer) are zero-based – so, they run from 0 thru 31, not from 1 thru 32, as your code assumes. Thus, in your code, you are (effectively) ignoring the lowest bit (bit #0); further, when you do the 1 << i on the last loop (when i == 32), your mask will (most likely) have a value of zero (although that shift is, technically, undefined behaviour for a signed integer, as your code uses), so you'll also drop the highest bit.
Second, your code prints (from left-to-right) the lowest bit first, but you want (presumably) to print the highest bit first, as is normal. So, you should run the loop with the i index starting at 31 and decrement it to zero.
Also, your code mixes and mingles unsigned and signed integer types. This sort of thing is best avoided – so it's better to use uint32_t for the intermediate values used in the loop.
Lastly (as mentioned by Eric in the comments), there is a far simpler way to extract "bit n" from an unsigned integer: just use value >> n & 1.
I don't have access to an Arduino platform but, to demonstrate the points made in the above discussion, here is a standard, console-mode C++ program that compares the output of your code to versions with the aforementioned corrections applied:
#include <cstdio>
#include <cstdint>
#include <inttypes.h>
int main()
{
uint32_t test = 0x84FF0048uL;
int i;
// Your code ...
for (i = 1; i <= 32; i++) {
int mask = 1 << i;
int masked_n = test & mask;
int thebit = masked_n >> i;
printf("%i", thebit);
}
printf("\n");
// Corrected limits/order/types ...
for (i = 31; i >= 0; --i) {
uint32_t mask = (uint32_t)(1) << i;
uint32_t masked_n = test & mask;
uint32_t thebit = masked_n >> i;
printf("%" PRIu32, thebit);
}
printf("\n");
// Better ...
for (i = 31; i >= 0; --i) {
printf("%" PRIu32, test >> i & 1);
}
printf("\n");
return 0;
}
The three lines of output (first one wrong, as you know; last two correct) are:
001001000000000111111110010000-10
10000100111111110000000001001000
10000100111111110000000001001000
Notes:
(1) On the use of the funny-looking "%" PRIu32 format specifier for printing the uint32_t types, see: printf format specifiers for uint32_t and size_t.
(2) The cast on the (uint32_t)(1) constant will ensure that the bit-shift is safe, even when int and unsigned are 16-bit types; without that, you would get undefined behaviour in such a case.
When you print the binary string representation of a number, you print the Most Significant Bit (MSB) first, whether the number is a uint32_t or a uint16_t. For a uint32_t you therefore start with a mask of 0x80000000 to test the MSB, and shift the mask down by one bit on each iteration.
#define DR_REG_RNG_BASE 0x3ff75144
void printBitByBit( ){
// READ_PERI_REG is the ESP32 function to read DR_REG_RNG_BASE
uint32_t rndval = READ_PERI_REG(DR_REG_RNG_BASE);
Serial.println(rndval, HEX); //print out the value in hex for verification purpose
uint32_t mask = 0x80000000;
for (int i=0; i<32; i++) { // 32 iterations, one per bit
Serial.print((rndval & mask) ? "1" : "0"); // print, not println, to keep the bits on one line
mask = (uint32_t) mask >> 1;
}
Serial.println("\n");
}
For Arduino, there are actually a couple of built-in functions that can print the binary string representation of a number. Serial.print(x, BIN) takes the number base as its second argument.
Another function that can achieve the same result is itoa(x, str, base) which is not part of standard ANSI C or C++, but available in Arduino to allow you to convert the number x to a str with number base specified.
char str[33];
itoa(rndval, str, 2);
Serial.println(str);
However, neither function pads the output with leading zeros; see the result here:
36E68B6D // rndval in HEX
00110110111001101000101101101101 // print by our function
110110111001101000101101101101 // print by Serial.print(rndval, BIN)
110110111001101000101101101101 // print by itoa(rndval, str, 2)
BTW, Arduino is c++, so don't use c tag for your post. I changed it for you.
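For the leading-zero issue above, here is a small sketch in standard C++ that always produces all 32 digits. It assumes a toolchain that ships the C++ standard library (the ESP32 Arduino core typically does; classic AVR boards do not), and the function names toBinary32 and toBinary32Manual are made up for this example:

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Returns all 32 bits of value, most significant bit first,
// including any leading zeros.
std::string toBinary32(std::uint32_t value) {
    return std::bitset<32>(value).to_string();
}

// Hand-written equivalent using the shift-and-mask idiom (value >> n & 1).
std::string toBinary32Manual(std::uint32_t value) {
    std::string out;
    out.reserve(32);
    for (int i = 31; i >= 0; --i) {
        out.push_back(((value >> i) & 1u) ? '1' : '0');
    }
    return out;
}
```

On an Arduino-style board the resulting string could then be passed to Serial.println via its c_str().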
I am trying to copy non-zero elements of an array to a different array using pointers. I have tried implementing the solution in thrust copy_if: incomplete type is not allowed but I get zeros in my resultant array. Here is my code:
This is the predicate functor:
struct is_not_zero
{
__host__ __device__
bool operator()( double x)
{
return (x != 0);
}
};
And this is where the copy_if function is used:
double out[5];
thrust::device_ptr<double> output = thrust::device_pointer_cast(out);
double *test1;
thrust::device_ptr<double> gauss_res(hostResults1);
thrust::copy_if(thrust::host,gauss_res, gauss_res+3,output, is_not_zero());
test1 = thrust::raw_pointer_cast(output);
for(int i =0;i<6;i++) {
cout << test1[i] << " the number " << endl;
}
where hostResults1 is the output array from a kernel.
You are making a variety of errors, as discussed in the comments, and you have not provided complete code, so it's not possible to list every error. Generally speaking, you appear to be mixing up device and host activity, and device and host pointers. These should be kept separate, and treated separately, in algorithms. The exception would be copying from device to host, but this can't be done with thrust::copy and raw pointers. You must use vector iterators or properly decorated thrust device pointers.
Here is a complete example based on what you have shown:
$ cat t66.cu
#include <thrust/copy.h>
#include <iostream>
#include <thrust/device_ptr.h>
struct is_not_zero
{
__host__ __device__
bool operator()( double x)
{
return (x != 0);
}
};
int main(){
const int ds = 5;
double *out, *hostResults1;
cudaMalloc(&out, ds*sizeof(double));
cudaMalloc(&hostResults1, ds*sizeof(double));
cudaMemset(out, 0, ds*sizeof(double));
double test1[ds];
for (int i = 0; i < ds; i++) test1[i] = 1;
test1[3] = 0;
cudaMemcpy(hostResults1, test1, ds*sizeof(double), cudaMemcpyHostToDevice);
thrust::device_ptr<double> output = thrust::device_pointer_cast(out);
thrust::device_ptr<double> gauss_res(hostResults1);
thrust::copy_if(gauss_res, gauss_res+ds,output, is_not_zero());
cudaMemcpy(test1, out, ds*sizeof(double), cudaMemcpyDeviceToHost);
for(int i =0;i<ds;i++) {
std::cout << test1[i] << " the number " << std::endl;
}
}
$ nvcc -o t66 t66.cu
$ ./t66
1 the number
1 the number
1 the number
1 the number
0 the number
First off, thanks for giving me a hand with this. I am no expert at C++, but I have done some work in C. The problem with my code is that it does not display the returned array values correctly.
In general, my program evaluates a function F(x), displays it in table format, and finds its min and max. I have found ways of doing all that, but when I display the values of the returned F(x) array, they come out distorted. The first value is always correct, for example:
cout << *(value+0) <<endl;
but the next value is not the expected f(x). Sorry in advance if my code is not up to the proper standard, but I have been wrapping my head around this for a while now.
My Full Code
#include <iostream>
#include <fstream>
#include <cmath>
#include <iomanip>
#include <string>
#include <stdlib.h>
using namespace std;
float *evaluate ();
void display ();
void Min_Max(float *);
int main()
{
float *p;
evaluate();
display();
cin.get();
p = evaluate();
Min_Max(p);
return 0;
}
float *evaluate()
{
ofstream Out_File("result.txt");
int n=30;
float x [n];
float fx[n];
float interval = ((4-(-2))/0.2);
x[0]= -2.0;
for(n=0;n <= interval;n++)
{
fx[n] = 4*exp((-x[n])/2)*sin((2*x[n]- 0.3)*3.14159/180);
x[n+1] = x[n] + 0.2;
if (Out_File.is_open())
{
Out_File <<setprecision(5)<<setw(8)<<showpoint<<fixed<< x[n];
Out_File << "\t\t"<<setprecision(5)<<setw(8)<<showpoint<<fixed<<fx[n]<<endl;
}
else cout << "Unable to open file";
}
Out_File.close();
return fx;
}
void display()
{
ifstream inFile;
inFile.open("result.txt");
string line;
cout << " x\t\t\t f(x)"<<endl;
cout << "_______________________________________"<<endl;
while( getline (inFile,line))
{
cout<<line<<endl;
}
inFile.close();
}
void Min_Max(float *value)
{
int a=0;
for(a=0;a<=30;a++){
cout << *(value+a) <<endl;
*value =0;}
}
You pass p to your function Min_Max, where p is a pointer to the first element of an array. That array is created as a local variable in another function, evaluate. That doesn't work: as soon as evaluate returns, all of its local variables, including the fx array, are destroyed, and the pointer you return is left dangling.
Instead, you can allocate fx on the heap (with new[]). But don't forget to free it afterward with delete[].
Also, look here
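As an alternative to manual new[] and delete[], here is a hedged sketch that returns a std::vector by value, so the caller owns the data and nothing dangles. This is a swapped-in technique, not what the answer above proposes. The signature of evaluate and the minMax helper are hypothetical, and file output is left out to keep the sketch short. Note that the range -2 to 4 in steps of 0.2 yields 31 samples, one more than the original 30-element arrays can hold:

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical rewrite of evaluate(): samples f(x) from start to stop
// (inclusive) and returns the values by value.
std::vector<float> evaluate(float start = -2.0f, float stop = 4.0f,
                            float step = 0.2f) {
    // Number of samples, rounded to avoid floating-point truncation.
    const int count = static_cast<int>(std::lround((stop - start) / step)) + 1;
    std::vector<float> fx;
    fx.reserve(count);
    for (int n = 0; n < count; ++n) {
        const float x = start + n * step;
        fx.push_back(4.0f * std::exp(-x / 2.0f) *
                     std::sin((2.0f * x - 0.3f) * 3.14159f / 180.0f));
    }
    return fx;
}

// Returns {minimum, maximum}. Precondition: values is not empty.
std::pair<float, float> minMax(const std::vector<float>& values) {
    const auto range = std::minmax_element(values.begin(), values.end());
    return {*range.first, *range.second};
}
```

The caller would store the result once, for example in a std::vector<float> named fx, and pass it to both the display and min/max code instead of calling evaluate twice.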
I'm using C++, but I really just need an idea of how to do this; I should be able to come up with my own code.
I know there are 112 possible combinations, but I'm trying to find a way to generate them all, possibly into an array, without having to do it manually.
It doesn't have to be an array (I can easily make it one if needed). I just need to generate all the binary numbers between 0 and 128 containing exactly 5 on bits.
bool/int bit[8];
whatever works where
bit[7]+bit[6]+bit[5]+bit[4]+bit[3]+bit[2]+bit[1]+bit[0]=5;
bool/int bits[8][112];
trying to figure out how to do this in a loop
I've been googling and haven't found anything close to what I'm trying to do, any help will be greatly appreciated.
I figured it out
#include <iostream>
using namespace std;
int binArray[8]={0,0,0,0,0,0,0,0};
int main(){
for(int i=0;i<56;i++){ // C(8,5) = 56 patterns exist; 112 passes would repeat them and write to binArray[-1]
do{
binArray[7]++;
for(int b=7;b>-1;b--){
if(binArray[b]==2){
binArray[b]=0;
binArray[b-1]++;
}
}
}
while(binArray[7]+binArray[6]+binArray[5]+binArray[4]+binArray[3]+binArray[2]+binArray[1]+binArray[0]!=5);
for(int j=0;j<8;j++){
cout << binArray[j] << " ";
}
cout << endl;
}
return 0;
}
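A shorter way to get the same patterns, sketched under the assumption that plain integers are acceptable instead of a digit array: walk every value of the given width and keep those with the wanted number of set bits. The function name valuesWithSetBits is hypothetical. Since C(8,5) = 56, there are 56 distinct 8-bit patterns with exactly five set bits (21 if the values are limited to 7 bits, i.e. below 128):

```cpp
#include <bitset>
#include <vector>

// Collects every value in [0, 2^width) whose binary form has exactly
// `ones` set bits, in increasing order. Precondition: width < 32.
std::vector<unsigned> valuesWithSetBits(unsigned width, unsigned ones) {
    std::vector<unsigned> result;
    for (unsigned v = 0; v < (1u << width); ++v) {
        // std::bitset::count returns the number of set bits.
        if (std::bitset<32>(v).count() == ones) {
            result.push_back(v);
        }
    }
    return result;
}
```

Each value can be printed as digits with std::bitset<8>(v).to_string() if the table form is needed.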
I am new to CUDA programming and am seeing some strange behaviour.
I have a kernel like this:
__global__ void myKernel (uint64_t *input, int numOfBlocks, uint64_t *state) {
int const t = blockIdx.x * blockDim.x + threadIdx.x;
int i;
for (i = 0; i < numOfBlocks; i++) {
if (t < 32) {
if (t < 8) {
state[t] = state[t] ^ input[t];
}
if (t < 25) {
deviceFunc(device_state); /* will use some printf() */
}
}
}
}
I run this kernel with this parameter:
myKernel<<<1, 32>>>(input, numOfBlocks, state);
If 'numOfBlocks' is equal to 1, it works fine: I get the result I expect, and the printf() calls inside deviceFunc() appear in the correct order.
If 'numOfBlocks' is equal to 2, it does not work! The result is not what I expected, and the printf() calls are not in the correct order (I only call printf() from thread 0)!
So, my question is: will the remaining threads (32 - 25 = 7), which DO NOT call deviceFunc(), wait and block at this position, or will they carry on and start the next for-loop iteration? I always thought that every line in the kernel is synchronized within the same block.
I worked on this the whole day and finally found a solution. First, you are right that my deviceFunc() had many RAW (read-after-write) hazards. I started putting __syncthreads() after every write, but I think this slows down my program, and I am not sure __syncthreads() is the usual way to resolve them. The funny thing is that the result is the same with and without __syncthreads().
But my problem in my code above is that I used
input[t]
which was wrong, because I had to include 'numOfBlocks' in my calculation of index:
input[(NUM_OF_XOR_THREADS * i) + t]
Now, the result was correct and my problem is solved.
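The indexing fix can be illustrated with a host-side emulation in plain C++. This is only a sketch: kXorLanes stands in for NUM_OF_XOR_THREADS, whose value is not shown in the post and is assumed to be 8 here (matching the t < 8 check), and absorbBlocks is a hypothetical name. The inner loop plays the role of the eight XOR threads:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed value; the original post does not show NUM_OF_XOR_THREADS.
constexpr std::size_t kXorLanes = 8;

// For iteration i, lane t reads input[kXorLanes * i + t] rather than
// input[t], so every iteration consumes a fresh chunk of the input.
// Preconditions: state.size() >= kXorLanes and
// input.size() >= kXorLanes * numOfBlocks.
void absorbBlocks(const std::vector<std::uint64_t>& input,
                  std::size_t numOfBlocks,
                  std::vector<std::uint64_t>& state) {
    for (std::size_t i = 0; i < numOfBlocks; ++i) {
        for (std::size_t t = 0; t < kXorLanes; ++t) {
            state[t] ^= input[kXorLanes * i + t];
        }
    }
}
```

With input[t] instead, a second iteration would XOR the same eight words into the state again and cancel the first pass, which matches the symptom of correct results only when numOfBlocks is 1.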