How to initialize values of a 2D array in a constructor?

I have a problem initializing the values of a 2D array in a constructor. I am creating a Tic-Tac-Toe game, and I am required to initialize the values of the 2D array so that they are displayed on the board. I am not sure how to do it.
I used a nested for loop to initialize the values, but it didn't work.
class TicTacToe
{
private:
    char matrix[3][3];
public:
    TicTacToe();
    TicTacToe(char matrix[3][3]);
};

TicTacToe::TicTacToe(char matrix[3][3])
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            cout << matrix[i][j] << " ";
        }
        cout << endl;
    }
}
I expect the values from the array to be displayed on the board.
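A minimal sketch of what the constructor could look like if the goal is to store the values rather than print them: the posted loop sends `matrix` to `cout` but never assigns to the member array. Arrays cannot be assigned wholesale in C++, so the nested loop has to copy element by element. The `at` accessor below is an illustrative helper, not part of the assignment:

```cpp
#include <cassert>
#include <iostream>

class TicTacToe {
private:
    char matrix[3][3];
public:
    // Default constructor: fill the board with blanks.
    TicTacToe() {
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                matrix[i][j] = ' ';
    }
    // Copy each element of the caller's array into the member array.
    TicTacToe(char board[3][3]) {
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                matrix[i][j] = board[i][j];
    }
    // Illustrative accessor so the stored values can be inspected.
    char at(int i, int j) const { return matrix[i][j]; }
    // Display the stored board, as the original loop intended.
    void print() const {
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++)
                std::cout << matrix[i][j] << ' ';
            std::cout << '\n';
        }
    }
};
```

The key difference from the question's code is the assignment `matrix[i][j] = board[i][j]` inside the loop; printing can then be a separate member function.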


What is the behavior of the vector here?

I am unable to comprehend why, in the test shown below, the iterator p never reaches the end, so the loop breaks only when k = 20. What exactly is push_back doing to cause undefined behavior? Is it because the vector dynamically allocated a bunch of additional storage for the new elements I want to use, and the amount is not necessarily the amount I will use?
#include <iostream>
#include <vector>
#include <list>
using namespace std;

const int MAGIC = 11223344;

void test()
{
    bool allValid = true;
    int k = 0;
    vector<int> v2(5, MAGIC);
    k = 0;
    for (vector<int>::iterator p = v2.begin(); p != v2.end(); p++, k++)
    {
        if (k >= 20) // prevent infinite loop
            break;
        if (*p != MAGIC)
        {
            cout << "Item# " << k << " is " << *p << ", not " << MAGIC << "!" << endl;
            allValid = false;
        }
        if (k == 2)
        {
            for (int i = 0; i < 5; i++)
                v2.push_back(MAGIC);
        }
    }
    if (allValid && k == 10)
        cout << "Passed test 3" << endl;
    else
        cout << "Failed test 3" << "\n" << k << endl;
}
int main()
{
test();
}
Inserting into a vector while iterating over it is a really bad idea: insertion may cause a memory reallocation that invalidates all iterators. In this case, the capacity was not enough to hold the additional elements, which caused a new allocation at a different address. You can check it yourself:
void test()
{
    bool allValid = true;
    int k = 0;
    vector<int> v2(5, MAGIC);
    k = 0;
    for (vector<int>::iterator p = v2.begin(); p != v2.end(); p++, k++)
    {
        cout << v2.capacity() << endl; // Print the vector capacity
        if (k >= 20) // prevent infinite loop
            break;
        if (*p != MAGIC) {
            //cout << "Item# " << k << " is " << *p << ", not " << MAGIC << "!" << endl;
            allValid = false;
        }
        if (k == 2) {
            for (int i = 0; i < 5; i++)
                v2.push_back(MAGIC);
        }
    }
    if (allValid && k == 10)
        cout << "Passed test 3" << endl;
    else
        cout << "Failed test 3" << "\n" << k << endl;
}
This code will output something like the following:
5
5
5
10 <-- the capacity has changed
10
... skipped ...
10
10
Failed test 3
20
We can see that when k equals 2 (third line), the capacity of the vector doubled (fourth line) because we added new elements. The memory is reallocated, and the vector elements are most likely now located elsewhere. You can also check this by printing the vector's base address with the data() member function instead of the capacity:
Address: 0x136dc20 k: 0
Address: 0x136dc20 k: 1
Address: 0x136dc20 k: 2
Address: 0x136e050 k: 3 <-- the address has changed
Address: 0x136e050 k: 4
... skipped ...
Address: 0x136e050 k: 19
Address: 0x136e050 k: 20
Failed test 3
20
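As a quick, self-contained sketch of that address check (the function name `bufferMoved` is illustrative, not from the original test): pushing one element past the current capacity forces a reallocation, after which data() reports a different buffer address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true when growing the vector past its capacity moved the buffer.
bool bufferMoved() {
    std::vector<int> v2(5, 0);
    std::uintptr_t before = reinterpret_cast<std::uintptr_t>(v2.data());
    while (v2.size() < v2.capacity())
        v2.push_back(0);      // use up any spare capacity first
    v2.push_back(0);          // exceeds capacity, so a new buffer is allocated
    std::uintptr_t after = reinterpret_cast<std::uintptr_t>(v2.data());
    // The new block is allocated while the old one is still alive,
    // so the two addresses must differ.
    return before != after;
}
```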
The code is poorly written; you can make it more robust by using indices instead of iterators.
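A sketch of that index-based rewrite (`testWithIndices` is an illustrative name): `v2[k]` re-reads the possibly relocated buffer on every access, so the loop survives the mid-iteration push_back, and re-evaluating `v2.size()` each iteration picks up the new length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int MAGIC = 11223344;

// Returns true when every element equals MAGIC. Indices stay valid
// across push_back even though the underlying buffer may move.
bool testWithIndices() {
    std::vector<int> v2(5, MAGIC);
    bool allValid = true;
    for (std::size_t k = 0; k < v2.size(); ++k) {
        if (v2[k] != MAGIC)
            allValid = false;
        if (k == 2)
            for (int i = 0; i < 5; ++i)
                v2.push_back(MAGIC);  // may reallocate; k is unaffected
    }
    return allValid;
}
```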

Share pointer between two classes in CUDA

I would like to create a vertex-and-edge structure with CUDA.
I have two classes.
class Connection {
public:
    float value;
    Connection()
    {
        this->value = 0;
    }
};

class Node
{
public:
    Connection *incoming;
    Connection *outgoing;
    int lenIncoming;
    int lenOutgoing;
    Node(Connection *incoming, Connection *outgoing, int lenIncoming, int lenOutgoing)
    {
        this->incoming = incoming;
        this->outgoing = outgoing;
        this->lenIncoming = lenIncoming;
        this->lenOutgoing = lenOutgoing;
    }
};
When I "connect" the nodes, I do the following:
Connection XA = Connection(10);
Connection AB = Connection(2);
Connection XB = Connection(10);
Connection BX = Connection(2);
Connection* incomingA;
Connection* outgoingA;
Connection* incomingB;
Connection* outgoingB;
cudaMallocManaged(&incomingA, 1 * sizeof(Connection*));
cudaMallocManaged(&outgoingA, 1 * sizeof(Connection*));
cudaMallocManaged(&incomingB, 2 * sizeof(Connection*));
cudaMallocManaged(&outgoingB, 1 * sizeof(Connection*));
incomingA[0] = XA;
outgoingA[0] = AB;
incomingB[0] = XB;
incomingB[1] = AB;
outgoingB[0] = BX;
Node nodeA = Node(incomingA, outgoingA);
Node nodeB = Node(incomingB, outgoingB);
What I would like to happen: when I change the value of nodeA->outgoing[0].value from within a method in Node, it should also affect nodeB.incoming[1].value; however, that is not the case. When I change the value from within nodeA, it remains the starting value in nodeB. I thought that since I passed a copy of the pointer to the object, it would update the original object; however, it seems I am mistaken, or I have made some error along the way.
Any suggestions on how this should be done will be greatly appreciated.
(BTW, the reason I use a class Connection instead of just floats is that in the future it will include more.)
The classes are created on the host.
Node has a method called run, which runs on the device.
__device__ __host__
void run()
{
    for (int i = 0; i < this->lenIncoming; i++)
    {
        this->incoming[i].value += 1;
    }
    for (int i = 0; i < this->lenOutgoing; i++)
    {
        this->outgoing[i].value += 2;
    }
}
Which in turn is called from a kernel
__global__
void kernel_run(Node *nodes)
{
    nodes[0].run();
    nodes[1].run();
};
The kernel is launched by running
kernel_run<<<1, 1>>>(nodes);
I can see that the value changes locally within nodeA, when debugging with Nsight.
As you have already mentioned, the problem is that the objects AB, XB, BX, etc. are being assigned by value rather than by reference, so copies are made of each object each time it is used (i.e. each time it is assigned to an incoming or outgoing connection), and the update to AB from one operation does not affect any other instance of AB.
One possible solution is to make all of your objects "singletons" and refer to them by reference. To make this work on both host and device we will allocate for these objects using cudaMallocManaged. Here's an example:
$ cat t1494.cu
#include <iostream>

class Connection {
public:
    float value;
    Connection()
    {
        this->value = 0;
    }
    Connection(float val)
    {
        this->value = val;
    }
};

class Node
{
public:
    Connection **incoming;
    Connection **outgoing;
    int lenIncoming;
    int lenOutgoing;
    Node(Connection **incoming, Connection **outgoing, int lenIncoming, int lenOutgoing)
    {
        this->incoming = incoming;
        this->outgoing = outgoing;
        this->lenIncoming = lenIncoming;
        this->lenOutgoing = lenOutgoing;
    }
    __device__ __host__
    void run()
    {
        for (int i = 0; i < this->lenIncoming; i++)
        {
            this->incoming[i]->value += 1;
        }
        for (int i = 0; i < this->lenOutgoing; i++)
        {
            this->outgoing[i]->value += 2;
        }
    }
};

__global__
void kernel_run(Node *nodes)
{
    nodes[0].run();
    nodes[1].run();
};

int main() {
    Connection *XA;
    cudaMallocManaged(&XA, sizeof(Connection));
    *XA = Connection(10);
    Connection *AB;
    cudaMallocManaged(&AB, sizeof(Connection));
    *AB = Connection(2);
    Connection *XB;
    cudaMallocManaged(&XB, sizeof(Connection));
    *XB = Connection(10);
    Connection *BX;
    cudaMallocManaged(&BX, sizeof(Connection));
    *BX = Connection(2);
    Connection **incomingA;
    Connection **outgoingA;
    Connection **incomingB;
    Connection **outgoingB;
    cudaMallocManaged(&incomingA, 1 * sizeof(Connection*));
    cudaMallocManaged(&outgoingA, 1 * sizeof(Connection*));
    cudaMallocManaged(&incomingB, 2 * sizeof(Connection*));
    cudaMallocManaged(&outgoingB, 1 * sizeof(Connection*));
    incomingA[0] = XA;
    outgoingA[0] = AB;
    incomingB[0] = XB;
    incomingB[1] = AB;
    outgoingB[0] = BX;
    Node *nodes;
    cudaMallocManaged(&nodes, 2 * sizeof(Node));
    nodes[0] = Node(incomingA, outgoingA, 1, 1);
    nodes[1] = Node(incomingB, outgoingB, 2, 1);
    std::cout << nodes[0].incoming[0]->value << std::endl;
    std::cout << nodes[0].outgoing[0]->value << std::endl;
    std::cout << nodes[1].incoming[0]->value << std::endl;
    std::cout << nodes[1].incoming[1]->value << std::endl;
    std::cout << nodes[1].outgoing[0]->value << std::endl;
    kernel_run<<<1, 1>>>(nodes);
    cudaDeviceSynchronize();
    std::cout << nodes[0].incoming[0]->value << std::endl;
    std::cout << nodes[0].outgoing[0]->value << std::endl;
    std::cout << nodes[1].incoming[0]->value << std::endl;
    std::cout << nodes[1].incoming[1]->value << std::endl;
    std::cout << nodes[1].outgoing[0]->value << std::endl;
}
$ nvcc -o t1494 t1494.cu
$ cuda-memcheck ./t1494
========= CUDA-MEMCHECK
10
2
10
2
2
11
5
11
5
4
========= ERROR SUMMARY: 0 errors
$
Note that this system works fine for updating these objects from a single thread. It is not guaranteed to work correctly if you update an object from separate CUDA threads. CUDA does not automatically sort out that kind of multi-thread concurrent access for you. It may be possible to use atomics or some other method, however.
Note that my objective has been to address the original design presented and identify a relatively minor design modification that would meet the stated request. I'm not intending to make any statements about the relative performance merits of this approach, or the suitability of this or any other approach for graph traversal algorithms.

What is the best way to make these functions into a class? Or multiple classes?

I am looking for advice on how to turn a function into a class. I will paste the program below. It is long, but I feel I should include the whole thing for context. I need to rewrite it so that it uses classes in place of the three functions.
#include <iostream>
#include <string>
using namespace std;

// Do not change these function prototypes:
void readBig(int[]);
void printBig(int[]);
void addBig(int[], int[], int[]);

// This constant should be 100 when the program is finished.
const int MAX_DIGITS = 100;

int main()
{
    // Declare the three numbers, the first, second and the sum:
    int number1[MAX_DIGITS], number2[MAX_DIGITS], sum[MAX_DIGITS];
    bool finished = false;
    char response;
    while (!finished)
    {
        cout << "Please enter a number up to " << MAX_DIGITS << " digits: ";
        readBig(number1);
        cout << "Please enter a number up to " << MAX_DIGITS << " digits: ";
        readBig(number2);
        addBig(number1, number2, sum);
        printBig(number1);
        cout << "\n+\n";
        printBig(number2);
        cout << "\n=\n";
        printBig(sum);
        cout << "\n";
        cout << "test again?";
        cin >> response;
        cin.ignore(900, '\n');
        finished = toupper(response) != 'Y';
    }
    return 0;
}

//ReadBig will read a number as a string,
//It then converts each element of the string to an integer and stores it in an integer array.
//Finally, it reverses the elements of the array so that the ones digit is in element zero,
//the tens digit is in element 1, the hundreds digit is in element 2, etc.
//AddBig adds the corresponding digits of the first two arrays and stores the answer in the third.
//In a second loop, it performs the carry operation.
//PrintBig uses a while loop to skip leading zeros and then uses a for loop to print the number.
//FUNCTIONS GO BELOW
void readBig(int number[MAX_DIGITS])
{
    string read = "";
    cin >> read;
    int len, i = 0, save = 0;   // i must be initialized before the while loop
    len = read.length();
    while (i < MAX_DIGITS) {
        number[i] = 0;
        i++;
    }
    for (i = 0; i <= len - 1; i++) {
        number[i] = int(read.at(i) - '0');
    }
    for (i = 0; i <= len / 2 - 1; i++) {
        save = number[i];
        number[i] = number[len - 1 - i];
        number[len - 1 - i] = save;
    }
}
void printBig(int number[MAX_DIGITS])
{
    int digit = MAX_DIGITS - 1;
    while (digit > 0 && number[digit] == 0) {   // guard so an all-zero number still prints "0"
        digit--;
    }
    for (int i = digit; i >= 0; i--) {
        cout << number[i];
    }
}
void addBig(int number1[MAX_DIGITS], int number2[MAX_DIGITS], int sum[MAX_DIGITS])
{
    // The code below sums the arrays.
    for (int i = MAX_DIGITS - 1; i >= 0; i--)
    {
        sum[i] = number1[i] + number2[i];
        if (sum[i] > 9 && i < MAX_DIGITS - 1)
        {
            sum[i + 1] += 1;
            sum[i] -= 10;
        }
    }
}
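Since the thread has no answer yet, here is one hedged sketch of how the three functions could be folded into a single class. The class name `BigNumber`, the method names, and the string-returning `toString` (used instead of printing so the result is easy to inspect) are all illustrative choices, not part of the original assignment; the carry is handled least-significant digit first in a single pass.

```cpp
#include <cassert>
#include <string>

const int MAX_DIGITS = 100;

// Wraps the digit array; each free function becomes a member.
class BigNumber {
    int digits[MAX_DIGITS];  // ones digit in element 0, tens in element 1, ...
public:
    BigNumber() {
        for (int i = 0; i < MAX_DIGITS; i++) digits[i] = 0;
    }
    // Replaces readBig: parse a decimal string and store the digits reversed.
    void readFromString(const std::string& s) {
        for (int i = 0; i < MAX_DIGITS; i++) digits[i] = 0;
        int len = (int)s.length();
        for (int i = 0; i < len; i++)
            digits[i] = s[len - 1 - i] - '0';
    }
    // Replaces addBig: digit-wise sum with the carry applied in the same pass.
    BigNumber add(const BigNumber& other) const {
        BigNumber result;
        int carry = 0;
        for (int i = 0; i < MAX_DIGITS; i++) {
            int d = digits[i] + other.digits[i] + carry;
            result.digits[i] = d % 10;
            carry = d / 10;
        }
        return result;
    }
    // Replaces printBig: skip leading zeros and build the printable string.
    std::string toString() const {
        int top = MAX_DIGITS - 1;
        while (top > 0 && digits[top] == 0) top--;
        std::string out;
        for (int i = top; i >= 0; i--) out += char('0' + digits[i]);
        return out;
    }
};
```

Each free function maps onto one member (readBig → readFromString, addBig → add, printBig → toString), and the digit array becomes private state instead of being passed around.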

How to pass a thrust::device_vector<int> to a function as a raw pointer

__global__ void HYPER (int tFast, int tFastLenth, int kilo, int lenPrzvFast, double eps, int AF, double *arrINTLighFast, int *arrPrzvFncFst, int dv_ptr)
{
    for (int j = 0; j < (tFast*tFastLenth); j++)
    {
        arrINTLighFast[j] = 0;
    }
    for (int j = 0; j < (kilo); j++) arrPrzvFncFst[j] = 0;
    for (int j = 1; j < (tFast*tFastLenth); j++)
    {
        arrINTLighFast[j] = arrINTLighFast[j-1] + AF*exp(-j/(eps+tFast));
    }
    for (int j = 0; j < (tFast*tFastLenth-1); j++)
    {
        for (int i = (arrINTLighFast[j]); i < (arrINTLighFast[j+1]); i++)
        {
            arrPrzvFncFst[i] = j;
        }
    }
    for (int j = 0; j < lenPrzvFast; j++)
    {
        devvecPrzvFncFst61440Slow149998[j] = arrPrzvFncFst[j];
    }
}

int main (void)
{
    const int tFast = 9;
    const int tFastLenth = 6;
    double arrINTLighFast[tFast*tFastLenth];
    int arrPrzvFncFst[61500];
    int AF = 1000;
    int kilo = 1024;
    int kilo150 = 149998;
    const double eps = 0.0000001;
    const int lenPrzvFast = 61500;
    thrust::host_vector<int> vecPrzvFncFst61440Slow149998;
    int Len_vecPrzv = (lenPrzvFast + kilo150);
    for (int j = 0; j < Len_vecPrzv; j++) vecPrzvFncFst61440Slow149998.push_back(0);
    for (int j = 0; j < Len_vecPrzv; j++) vecPrzvFncFst61440Slow149998[j] = 0;
    thrust::device_vector<int> devvecPrzvFncFst61440Slow149998 = vecPrzvFncFst61440Slow149998;
    int *dv_ptr = thrust::raw_pointer_cast(devvecPrzvFncFst61440Slow149998.data());
    HYPER <<<blocks, threads>>>(tFast, tFastLenth, kilo, lenPrzvFast, eps, AF, arrINTLighFast, arrPrzvFncFst, dv_ptr);
    thrust::host_vector<int> HostvecPrzvFncFst61440Slow149998 = devvecPrzvFncFst61440Slow149998;
    std::cout << "Device vector is: " << std::endl;
    for (int j = 0; j < vecPrzvFncFst61440Slow149998.size(); j++)
        std::cout << "vecPrzvFncFst61440Slow149998[" << j << "] = " << HostvecPrzvFncFst61440Slow149998[j] << std::endl;
    return 0;
}
The problem is that I cannot use vectors in the function, so I decided to use thrust::raw_pointer_cast. However, I have two problems. First, during compilation I get the error: identifier "devvecPrzvFncFst61440Slow149998" is undefined. Second, I cannot figure out how to pass int *dv_ptr to both the function and the prototype; there I get the error: argument of type "int *" is incompatible with parameter of type "int". I have searched the internet, but I found no solution to the problems mentioned above.
Thank you for your time.
Your kernel function HYPER has no parameter named devvecPrzvFncFst61440Slow149998, so when you try to use it here:
for(int j = 0;j<lenPrzvFast;j++)
{ devvecPrzvFncFst61440Slow149998[j]= arrPrzvFncFst[j] ;}
you're going to get the undefined identifier error. There's no magic here: your CUDA kernel mostly has to comply with the rules of an ordinary C function. If you want to use a variable, it had better be listed in the function parameters (except for global-scope variables and built-in variables, which this isn't).
The other problem you mention is due to the fact that dv_ptr is a pointer type:
int *dv_ptr = thrust::raw_pointer_cast(devvecPrzvFncFst61440Slow149998.data());
but you are attempting to pass it in a kernel parameter position:
HYPER <<<blocks, threads>>>(..., dv_ptr);
^^^^^^
that is expecting an ordinary (non-pointer) type:
__global__ void HYPER (..., int dv_ptr)
^^^^^^^^^^
The following code has those issues fixed and compiles cleanly for me:
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

#define blocks 1
#define threads 1

__global__ void HYPER (int tFast, int tFastLenth, int kilo, int lenPrzvFast, double eps, int AF, double *arrINTLighFast, int *arrPrzvFncFst, int *dv_ptr)
{
    for (int j = 0; j < (tFast*tFastLenth); j++)
    {
        arrINTLighFast[j] = 0;
    }
    for (int j = 0; j < (kilo); j++) arrPrzvFncFst[j] = 0;
    for (int j = 1; j < (tFast*tFastLenth); j++)
    {
        arrINTLighFast[j] = arrINTLighFast[j-1] + AF*exp(-j/(eps+tFast));
    }
    for (int j = 0; j < (tFast*tFastLenth-1); j++)
    {
        for (int i = (arrINTLighFast[j]); i < (arrINTLighFast[j+1]); i++)
        {
            arrPrzvFncFst[i] = j;
        }
    }
    for (int j = 0; j < lenPrzvFast; j++)
    {
        dv_ptr[j] = arrPrzvFncFst[j];
    }
}

int main (void)
{
    const int tFast = 9;
    const int tFastLenth = 6;
    double arrINTLighFast[tFast*tFastLenth];
    int arrPrzvFncFst[61500];
    int AF = 1000;
    int kilo = 1024;
    int kilo150 = 149998;
    const double eps = 0.0000001;
    const int lenPrzvFast = 61500;
    thrust::host_vector<int> vecPrzvFncFst61440Slow149998;
    int Len_vecPrzv = (lenPrzvFast + kilo150);
    for (int j = 0; j < Len_vecPrzv; j++) vecPrzvFncFst61440Slow149998.push_back(0);
    for (int j = 0; j < Len_vecPrzv; j++) vecPrzvFncFst61440Slow149998[j] = 0;
    thrust::device_vector<int> devvecPrzvFncFst61440Slow149998 = vecPrzvFncFst61440Slow149998;
    int *dv_ptr = thrust::raw_pointer_cast(devvecPrzvFncFst61440Slow149998.data());
    HYPER <<<blocks, threads>>>(tFast, tFastLenth, kilo, lenPrzvFast, eps, AF, arrINTLighFast, arrPrzvFncFst, dv_ptr);
    thrust::host_vector<int> HostvecPrzvFncFst61440Slow149998 = devvecPrzvFncFst61440Slow149998;
    std::cout << "Device vector is: " << std::endl;
    for (int j = 0; j < vecPrzvFncFst61440Slow149998.size(); j++)
        std::cout << "vecPrzvFncFst61440Slow149998[" << j << "] = " << HostvecPrzvFncFst61440Slow149998[j] << std::endl;
    return 0;
}

FFTW - computing the IFFT without first computing an FFT

This may seem like a simple question but I've been trying to find the answer on the FFTW page and I am unable to.
I created the FFTW plans for forward and backward transforms, and I fed some data into the fftw_complex *fft array directly (instead of computing an FFT from input data first). Then I compute an IFFT on this, and the result is not correct. Am I doing this right?
EDIT: So what I did is the following:
int ht = 2, wd = 2;
fftw_complex *inp = fftw_alloc_complex(ht * wd);
fftw_complex *fft = fftw_alloc_complex(ht * wd);
fftw_complex *ifft = fftw_alloc_complex(ht * wd);
fftw_plan plan_f = fftw_plan_dft_1d(wd * ht, inp, fft, FFTW_FORWARD, FFTW_ESTIMATE);
fftw_plan plan_b = fftw_plan_dft_1d(wd * ht, fft, ifft, FFTW_BACKWARD, FFTW_ESTIMATE);
for (int i = 0; i < 2; i++)
{
    for (int j = 0; j < 2; j++)
    {
        inp[wd*i + j][0] = 1.0;
        inp[wd*i + j][1] = 0.0;
    }
}
// fftw_execute(plan_f);
for (int i = 0; i < 2; i++)
{
    for (int j = 0; j < 2; j++)
    {
        fft[wd*i + j][1] = 0.0;
        if (i == j == 0)
            fft[wd*i + j][0] = 4.0;
        else
            fft[wd*i + j][0] = 0.0;
        std::cout << fft[wd*i + j][0] << " and " << fft[wd*i + j][1] << std::endl;
    }
}
fftw_execute(plan_b);
for (int i = 0; i < 2; i++)
{
    for (int j = 0; j < 2; j++)
        std::cout << ifft[wd*i + j][0] / (double)(wd*ht) << " and " << ifft[wd*i + j][1] / (double)(wd*ht) << std::endl;
}
This is the full code. The IFFT should return [1 1 1 1] for the real part. It doesn't.
I did the stupidest thing: in the if condition I posted i == j == 0 instead of i == 0 && j == 0. When I fixed that, it works. Thank you all so much for helping me out.
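For anyone who lands on the same trap: C++ has no chained comparisons, so i == j == 0 groups as (i == j) == 0, where the bool result of the first comparison converts to 0 or 1 before the second comparison runs. A tiny sketch (the function names are made up for illustration):

```cpp
#include <cassert>

// i == j == 0 groups as (i == j) == 0: the bool from (i == j)
// converts to 0 or 1, which is then compared against 0.
bool chained(int i, int j)  { return i == j == 0; }       // the buggy form
bool intended(int i, int j) { return i == 0 && j == 0; }  // what was meant
```

The buggy form is false at (0, 0), the one cell where the code above needed it to be true, and true whenever i and j differ.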