CUDA multiple files

Is there any way to do something like this in CUDA + C++?
class : 1
class1
{
__device__ ....//some cuda code
void ExecuteCuda1(); //this should execute the cuda code in this class
}
class : 2
class2
{
__device__ ....//some cuda code
void ExecuteCuda2(); //this should execute the cuda code in this class
}
class : 3
class3
{
cl1 = new class1();
cl1->ExecuteCuda1();
cl1 = new class2();
cl1->ExecuteCuda2();
}

No, your code won't work, because ExecuteCuda1 and ExecuteCuda2 are __host__ functions, meaning they will execute on the CPU. It's illegal for a __host__ function to call any function marked as __device__, even if they are member functions of a common class.
Structuring your code like this will work:
__global__ void kernel1() {...}
class Class1
{
public:
    void ExecuteCuda1()
    {
        // launch kernel1
        kernel1<<<...>>>();
    }
};
__global__ void kernel2() {...}
class Class2
{
public:
    void ExecuteCuda2()
    {
        // launch kernel2
        kernel2<<<...>>>();
    }
};
class Class3
{
public:
    void ExecuteCuda3()
    {
        // Stack objects, so nothing has to be deleted afterwards
        Class1 cl1;
        cl1.ExecuteCuda1();
        Class2 cl2;
        cl2.ExecuteCuda2();
    }
};
Note that __global__ functions cannot be member functions, even if they are declared static. That's why we must define the kernels outside of any class. As Dan says, the device code must be defined within a single translation unit, because CUDA had no device linker when this was written (toolkits from CUDA 5.0 onwards support separate compilation with nvcc -rdc=true). You can still split the code across multiple files by putting the definitions in header files.

Yes, but device code (i.e. functions marked with __device__ or __global__) must be visible in the same compilation unit where it is needed, i.e. there is no linking.
So as long as the headers contain the definitions of Class1 and Class2, and not just the declarations, it should work.

Related

'Gdiplus::Graphics' : no appropriate default constructor available

Not sure what I am doing wrong.. I have a very muddy idea of how Constructors should be formatted or structured, so any insights would help!
Renderer.h
#pragma once
#include <afxwin.h>
#include <winapifamily.h>
#include <wtypes.h>
#include <gdiplus.h>
class Renderer
{
public:
Renderer();
~Renderer();
void Clear(Gdiplus::Color clearColor);
virtual void Free() = 0;
virtual void LoadFace(int index, char* path) = 0;
void InitFromHDC(HDC dc);
void Shutdown();
// Drawing surface
Gdiplus::Graphics _graphics;
protected:
private:
bool _gdiplusActive;
};
Renderer.cpp
Renderer::Renderer()
: _gdiplusActive(false)
{ // <-error here
}
Renderer::~Renderer() {}
...
I tried many variations of adding variables, but honestly, the error is probably obvious to anyone who understands what a default constructor is.

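A default constructor is a constructor that can be called with no arguments.
Your Renderer constructor does not initialize _graphics in its initializer list, so the compiler tries to default-construct it, and Gdiplus::Graphics has no default constructor. You can't create a Gdiplus::Graphics out of thin air; you need to give it something to draw on: a bitmap, a window or a device context.
Here is the list of available constructors: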
https://learn.microsoft.com/en-us/windows/win32/api/gdiplusgraphics/nf-gdiplusgraphics-graphics-graphics(constgraphics_)
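The two usual ways around this can be sketched without GDI+ at all. In the sketch below, Surface is a hypothetical stand-in for a type like Gdiplus::Graphics (no default constructor), and EagerRenderer and LazyRenderer are made-up names, not part of any library. It is only meant to show the shape of the fix, assuming C++14 or later.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a type like Gdiplus::Graphics:
// it can only be constructed from something to draw on.
class Surface {
public:
    explicit Surface(const std::string& target) : target_(target) {}
    const std::string& target() const { return target_; }
private:
    std::string target_;
};

// Option 1: the owner takes what the member needs and forwards it
// in the member initializer list.
class EagerRenderer {
public:
    explicit EagerRenderer(const std::string& target) : surface_(target) {}
    const Surface& surface() const { return surface_; }
private:
    Surface surface_;
};

// Option 2: hold the member through a pointer and create it later,
// once the drawing target is known (compare InitFromHDC in the question).
class LazyRenderer {
public:
    LazyRenderer() = default;
    void init(const std::string& target) {
        surface_ = std::make_unique<Surface>(target);
    }
    bool ready() const { return surface_ != nullptr; }
    const Surface& surface() const {
        assert(surface_ && "init() must be called first");
        return *surface_;
    }
private:
    std::unique_ptr<Surface> surface_;
};
```

Option 2 is probably closer to what the Renderer in the question needs, since the HDC only arrives later through InitFromHDC.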

When would I need a Virtual Function?

I understand that a Virtual Function is a function that can be redefined in classes that inherit that function.
Yet, I do not understand why I would need a Virtual Function. Can someone explain me or show me cases where I would need Virtual Functions?
Thanks!

There is a nice explanation with a good example here:
https://en.wikipedia.org/wiki/Virtual_function

Any function can be redefined in a derived class. The key to virtual functions is that they are meant to be overridden: a call made through a base-class pointer or reference then runs the derived class's version.
Suppose you have a polygon class (in C++):
class Polygon {
protected:
int width, height;
public:
void set_values (int a, int b)
{ width=a; height=b; }
virtual int area ()
{ return 0; }
};
Now it doesn't make sense to give Polygon::area a real implementation inside the Polygon class, because at this level you don't know what kind of polygon it is. Declaring it virtual lets every derived class supply its own version. Note that a plain virtual function does not force derived classes to override it; to enforce that, make it pure virtual (virtual int area() = 0;).
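To make that concrete, here is a minimal sketch that extends the Polygon example. Rectangle, Triangle and total_area are names I made up for illustration, area() is pure virtual here (unlike the snippet above), and C++14 or later is assumed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Polygon {
public:
    Polygon(int w, int h) : width(w), height(h) {}
    virtual ~Polygon() = default;      // needed to delete through a base pointer
    virtual int area() const = 0;      // pure virtual: derived classes must implement it
protected:
    int width, height;
};

class Rectangle : public Polygon {
public:
    using Polygon::Polygon;            // reuse the (width, height) constructor
    int area() const override { return width * height; }
};

class Triangle : public Polygon {
public:
    using Polygon::Polygon;
    int area() const override { return width * height / 2; }
};

// This function only knows about Polygon, yet each call to area()
// ends up in the right derived class at run time.
int total_area(const std::vector<std::unique_ptr<Polygon>>& shapes) {
    int sum = 0;
    for (const auto& shape : shapes) {
        assert(shape != nullptr);
        sum += shape->area();
    }
    return sum;
}
```

Without virtual, code like total_area would need to know every concrete shape type in advance.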

cocos2d-x-3.0 draw vs onDraw

I'm using cocos2d-x v3.0 and in some test project I'm doing some custom drawing by overriding Node's draw method, but in the DrawPrimitives example provided they do something like this:
void DrawPrimitivesTest::draw()
{
_customCommand.init(_globalZOrder);
_customCommand.func = CC_CALLBACK_0(DrawPrimitivesTest::onDraw, this);
Director::getInstance()->getRenderer()->addCommand(&_customCommand);
}
void DrawPrimitivesTest::onDraw()
{
// drawing code here, why?
}
From reading the header and source files it seems like this may be some way of sending render commands straight to the renderer, is that correct?
Should I be using this method to do custom drawing? What's the difference between draw and onDraw?
EDIT:
As @Pedro Soares mentioned, since Cocos2D-X 3.0 you can't override draw() anymore; you have to override draw(Renderer *renderer, const kmMat4 &transform, bool transformUpdated) instead.

There is a sample in the cocos2d-x RC0 package that shows how to use DrawPrimitives on top of other layers.
In your Layer's .h file, add the following:
private:
void onDrawPrimitives(const kmMat4 &transform, bool transformUpdated);
CustomCommand _customCommand;
Now in the cpp of the Layer, override the layer draw method and include the onDrawPrimitives method:
void MyLayer::onDrawPrimitives(const kmMat4 &transform, bool transformUpdated)
{
kmGLPushMatrix();
kmGLLoadMatrix(&transform);
//add your primitive drawing code here
DrawPrimitives::drawLine(ccp(0,0), ccp(100, 100));
//restore the matrix saved by kmGLPushMatrix
kmGLPopMatrix();
}
void MyLayer::draw(Renderer *renderer, const kmMat4& transform, bool transformUpdated)
{
_customCommand.init(_globalZOrder);
_customCommand.func = CC_CALLBACK_0(MyLayer::onDrawPrimitives, this, transform, transformUpdated);
renderer->addCommand(&_customCommand);
}

In the future, the cocos2d-x 3.x renderer will be multithreaded, with a command pool.
The draw method is called by the visit method to create a new command. When the command pool executes that command, onDraw is called. At the moment, commands are executed on a single thread, but in your onDraw override you should assume that it may be called on another thread, to simplify future migration.
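The split between queuing and executing can be sketched in plain C++. This is not the cocos2d-x API: CommandQueue and Sprite are hypothetical stand-ins that only mimic the idea of draw() adding a command and onDraw() running later when the renderer processes its queue.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the engine's renderer: it stores commands
// and runs them later, all at once.
class CommandQueue {
public:
    void addCommand(std::function<void()> command) {
        assert(command && "command must be callable");
        commands_.push_back(std::move(command));
    }
    void flush() {
        for (auto& command : commands_) command();
        commands_.clear();
    }
    std::size_t pending() const { return commands_.size(); }
private:
    std::vector<std::function<void()>> commands_;
};

class Sprite {
public:
    explicit Sprite(std::vector<std::string>& log) : log_(log) {}
    // Counterpart of draw(): records what to do, draws nothing yet.
    void draw(CommandQueue& queue) {
        log_.push_back("draw: queued");
        queue.addCommand([this] { onDraw(); });
    }
private:
    // Counterpart of onDraw(): the actual drawing, run when the queue is flushed.
    void onDraw() { log_.push_back("onDraw: executed"); }
    std::vector<std::string>& log_;
};
```

Calling draw() on a Sprite only adds an entry to the queue; the "onDraw: executed" entry shows up in the log after flush(). That gap is why onDraw should not rely on state that may change between the two calls.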

I use the draw method for debug drawing like this; it may be helpful:
void HelloWorld::draw(Renderer *renderer, const Mat4 &transform, uint32_t flags)
{
Layer::draw(renderer, transform, flags);
Director* director = Director::getInstance();
GL::enableVertexAttribs(GL::VERTEX_ATTRIB_FLAG_POSITION );
director->pushMatrix(MATRIX_STACK_TYPE::MATRIX_STACK_MODELVIEW);
world->DrawDebugData();
director->popMatrix(MATRIX_STACK_TYPE::MATRIX_STACK_MODELVIEW);
}

The signature of your draw() override must match the base class function.
The draw method of Node in cocos2d-x 3.3rc is:
virtual void draw(Renderer *renderer, const Mat4& transform, uint32_t flags);

Proper use of cudaDeviceReset()

Since I suspect that the "black box" (GPU) is not shutting down cleanly in some larger code (perhaps in other people's code too), I would like to include a cudaDeviceReset() at the end of main(). But wait! This would cause a segmentation fault for every class instance created in main() that has non-trivial CUDA code in its destructor, right? E.g.
class A {
public:
cudaEvent_t tt;
cudaEvent_t uu;
A() {
cudaEventCreate(&tt);
cudaEventCreate(&uu);
}
~A(){
cudaEventDestroy(tt);
cudaEventDestroy(uu);
}
};
instantiated as a local object in main():
int main() {
A t;
cudaDeviceReset();
return 0;
}
segfaults on exit. Question: is cudaDeviceReset() perhaps invoked automatically on exit from main()?
Otherwise, all the useful code of main() should be moved to some run() function, and cudaDeviceReset() should be the last statement in main(), right?

As indicated by Talonmies, the destructor of class A runs when main() finishes, which is after cudaDeviceReset() has already been called.
I think you could move cudaDeviceReset() into a function registered with atexit(). Such a function runs after main() has returned, so t has already been destroyed by then.
#include <cstdlib>
void myexit() {
    cudaDeviceReset();
}
int main() {
    atexit(myexit);
    A t;
    return 0;
}
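The run() idea from the question works for the same reason: everything that owns CUDA resources is destroyed before the reset. Here is a minimal sketch of that ordering that runs without a GPU. FakeDevice and Resource are hypothetical stand-ins for the CUDA runtime and for class A; they only log the order of events.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the device: records events instead of calling CUDA.
struct FakeDevice {
    std::vector<std::string> log;
    bool alive = true;
    void reset() {                      // plays the role of cudaDeviceReset()
        assert(alive && "reset called twice");
        log.push_back("reset");
        alive = false;
    }
};

// Plays the role of class A: acquires in the constructor, releases in the destructor.
class Resource {
public:
    explicit Resource(FakeDevice& dev) : dev_(dev) { dev_.log.push_back("create"); }
    ~Resource() {
        // Releasing after a reset is the failure case from the question.
        dev_.log.push_back(dev_.alive ? "destroy" : "destroy after reset");
    }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
private:
    FakeDevice& dev_;
};

// All useful work lives here, so every Resource dies when run() returns.
void run(FakeDevice& dev) {
    Resource t(dev);
    // ... useful work ...
}

// What main() would do: run first, reset last.
std::vector<std::string> scoped_shutdown() {
    FakeDevice dev;
    run(dev);
    dev.reset();
    return dev.log;
}
```

With real CUDA code the same shape would be run() followed by cudaDeviceReset() as the last statement of main(); a plain { ... } block around the objects gives the same ordering.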

code for optimized watch

I want to implement code that keeps watch for some event. I don't have any built-in event watcher at the moment, so I have to implement my own, one that consumes as little CPU and memory as possible.
Can you suggest one?
For example, here is some pseudocode:
while(true)
{
if(process.isrunning)
process.kill();
}

If you don't have any event to hook into, then your code has to be "active" to run the checks. And that costs CPU cycles.
What you can do to reduce the waste is to add a call to sleep (Thread.Sleep in .NET, sleep in some implementations of C++).
while (true) {
if(process.isrunning)
process.kill();
sleep(100); // Wait 100 milliseconds before trying again
}
But that will make your code a little less responsive.
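In standard C++ (C++11 and later) the same idea can be written with std::this_thread::sleep_for. This is only a sketch: poll_until is a made-up helper, and the condition is passed in as a callable because the process object from the question is not a real API.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Checks `condition` every `interval` until it returns true or
// `max_polls` checks have been made.
// Returns the number of checks it took, or -1 if it never became true.
int poll_until(const std::function<bool()>& condition,
               std::chrono::milliseconds interval,
               int max_polls) {
    assert(max_polls > 0);
    for (int polls = 1; polls <= max_polls; ++polls) {
        if (condition()) return polls;
        // The thread sleeps here, so it uses no CPU between checks.
        std::this_thread::sleep_for(interval);
    }
    return -1;
}
```

A longer interval costs less CPU but reacts later, which is the trade-off mentioned above. The max_polls limit is there so the loop cannot spin forever; drop it if you really want while(true).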

You can try using a timer queue: http://msdn.microsoft.com/en-us/library/ms687003%28VS.85%29.aspx It uses the kernel scheduler to call your callback function at a specified interval. The callback runs on a different thread, so it doesn't interrupt the main thread and your application stays responsive. The thread is managed by Windows, so you don't have to manage your own polling thread, and the timing is relatively accurate.
Implementation example:
#include <windows.h>

// A singleton class that holds the timer queue
class TimerQueue {
protected:
    HANDLE timerQueue;
    TimerQueue() {
        this->timerQueue = ::CreateTimerQueue();
    }
    ~TimerQueue() {
        if (this->timerQueue) {
            ::DeleteTimerQueueEx(this->timerQueue, NULL);
            this->timerQueue = NULL;
        }
    }
public:
    static HANDLE getHandle() {
        static TimerQueue timerQueueSingleton;
        return timerQueueSingleton.timerQueue;
    }
};

// Timer base class
class Timer
{
protected:
    HANDLE timer;
    virtual void timerProc() = 0;
    // Static trampoline: Windows calls this, and it forwards to the object
    static void CALLBACK timerCallback(PVOID param, BOOLEAN timerOrWait) {
        Timer* self = static_cast<Timer*>(param);
        self->timerProc();
    }
public:
    // Note: with startTimeMs == 0 the callback may fire before the derived
    // class is fully constructed; use a small delay to be safe.
    Timer(DWORD startTimeMs, DWORD periodTimeMs) {
        if (!::CreateTimerQueueTimer(&this->timer, TimerQueue::getHandle(),
                                     &Timer::timerCallback,
                                     this, startTimeMs, periodTimeMs,
                                     WT_EXECUTEDEFAULT)) {
            this->timer = NULL;
        }
    }
    virtual ~Timer() {
        if (this->timer) {
            // The second argument is the timer handle itself, not a pointer to it
            ::DeleteTimerQueueTimer(TimerQueue::getHandle(), this->timer, NULL);
            this->timer = NULL;
        }
    }
};

// Derive and implement timerProc
class MyTimer : public Timer
{
protected:
    virtual void timerProc() {
        // `process` is the placeholder object from the question's pseudocode
        if (process.isRunning()) {
            process.kill();
        }
    }
public:
    MyTimer(DWORD startTimeMs, DWORD periodTimeMs)
        : Timer(startTimeMs, periodTimeMs) {}
};

// Usage:
int main(int argc, char* argv[]) {
    MyTimer timer(0, 100); // start immediately, then fire every 100 ms (10 Hz)
    // ... keep the program running here; the timer is deleted when `timer` goes out of scope
}
Disclaimer: I haven't compiled or tested this code, so check it before you use it.

Although you've tagged this as language-agnostic, any good implementation is going to vary widely not just from one language to another, but across operating systems. There are plenty of circumstances where programs or operating system functions need to do just this sort of thing, and mechanisms will have been implemented to do this in as sensible, non-intrusive a way as possible.
If you have a particular language and/or operating system in mind, please tell us, and give us a better idea of what you're trying to achieve. That way we can point you towards the most appropriate of the many possible solutions to this problem.
