Proper use of cudaDeviceReset() - cuda

I suspect that the "black box" (the GPU) is not being shut down cleanly in some larger code (and perhaps in other people's code too), so I would like to call cudaDeviceReset() at the end of main(). But wait! Wouldn't this cause a segmentation fault for every class instance created in main() whose destructor contains non-trivial CUDA code? E.g.
class A {
public:
    cudaEvent_t tt;
    cudaEvent_t uu;
    A() {
        cudaEventCreate(&tt);
        cudaEventCreate(&uu);
    }
    ~A() {
        cudaEventDestroy(tt);
        cudaEventDestroy(uu);
    }
};
instantiated as a local (non-dynamic) object in main():
int main() {
    A t;
    cudaDeviceReset();
    return 0;
}
segfaults on exit. Question: is cudaDeviceReset() perhaps invoked automatically on exit from main()?
Otherwise, all the useful code in main() would have to be moved into some run() function, with cudaDeviceReset() as the last call in main(), right?

As Talonmies pointed out, the destructor of class A runs when main() returns, which is after cudaDeviceReset() has already been called.
I think you could move cudaDeviceReset() into a function registered with atexit(). Objects local to main() are destroyed when main() returns, before the atexit handlers run.
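#include <cstdlib>          // atexit
#include <cuda_runtime.h>   // cudaDeviceReset
// Registered with atexit(), so it runs after main() has returned.
void myexit() {
    cudaDeviceReset();
}
int main() {
    atexit(myexit);
    A t;   // destroyed when main() returns, before myexit() runs
    return 0;
}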
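The run() approach suggested in the question can be sketched without a GPU. The following is a minimal illustration in plain C++, not CUDA code: Resource and fake_device_reset() are hypothetical stand-ins for class A and cudaDeviceReset(), and app_main() plays the role of main(). It only demonstrates the ordering, namely that objects local to run() are destroyed before the reset step executes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so the sequence can be inspected.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Number of Resource objects currently alive.
int& live_resources() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for class A: acquires in the constructor and
// releases in the destructor (like cudaEventCreate / cudaEventDestroy).
struct Resource {
    Resource()  { ++live_resources(); event_log().push_back("create"); }
    ~Resource() { --live_resources(); event_log().push_back("destroy"); }
};

// Hypothetical stand-in for cudaDeviceReset(): it should only run once
// every Resource has been released.
void fake_device_reset() {
    assert(live_resources() == 0);
    event_log().push_back("reset");
}

// All the "useful" work lives here, so its objects die when run() returns.
void run() {
    Resource t;
    // ... work that uses t ...
}

// Plays the role of main(): the reset is the last step, after run() returns.
int app_main() {
    run();
    fake_device_reset();
    return 0;
}
```

Calling app_main() records create, destroy, reset in that order, which is the order a real main() would need for the destructors of A to run against a live device.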

Related

STM32F407 Stucks on UART Interrupt

I have a problem with my STM32F407 and the UART.
I want to receive data on USART3 in interrupt mode.
My code is essentially this:
UART_HandleTypeDef huart3;
uint8_t uartData[16];   // receive buffer used by the calls below
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    if(huart == &huart3)
    {
        // Re-arm reception for the next block
        if(HAL_UART_Receive_IT(&huart3, uartData, sizeof(uartData)) != HAL_OK)
        {
            uint32_t error = HAL_UART_GetError(&huart3);
        }
    }
}
void HAL_UART_ErrorCallback(UART_HandleTypeDef *huart)
{
    uint32_t error = huart->ErrorCode;
}
int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_USART3_UART_Init();
    MX_NVIC_Init();
    HAL_UART_Receive_IT(&huart3, uartData, sizeof(uartData));
    while (1)
    {
        osDelay(1);
    }
}
I send the text 'test' from my serial console until the buffer is full and the interrupt fires.
The interrupt fires once and then never again. HAL_UART_Receive_IT returns HAL_OK, but
HAL_UART_ErrorCallback is called after HAL_UART_RxCpltCallback with error 0x8, which indicates a UART overrun error.
I don't see any blocking call that could cause the UART to overrun. Or is my function wrong?
There are many examples that use the UART with this same code, so what is the problem?
Thank you!
Daniel

'Gdiplus::Graphics' : no appropriate default constructor available

I'm not sure what I am doing wrong. I have only a very muddy idea of how constructors should be formatted or structured, so any insights would help!
Renderer.h
#pragma once
#include <afxwin.h>
#include <winapifamily.h>
#include <wtypes.h>
#include <gdiplus.h>
class Renderer
{
public:
    Renderer();
    ~Renderer();
    void Clear(Gdiplus::Color clearColor);
    virtual void Free() = 0;
    virtual void LoadFace(int index, char* path) = 0;
    void InitFromHDC(HDC dc);
    void Shutdown();
    // Drawing surface
    Gdiplus::Graphics _graphics;
protected:
private:
    bool _gdiplusActive;
};
Renderer.cpp
Renderer::Renderer()
: _gdiplusActive(false)
{ // <-error here
}
Renderer::~Renderer() {}
...
I tried many variations of adding variables, but honestly, the error may be obvious to anyone who understands what a default constructor is. I don't know.
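A default constructor is a constructor that takes no parameters.
Because Renderer::Renderer() does not initialize _graphics in its member initializer list, the compiler tries to default-construct it, and Gdiplus::Graphics has no default constructor.
You can't create a Gdiplus::Graphics out of thin air; you need to give it something to draw on: a bitmap, a window or a device context.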
Here is the list of constructors available:
https://learn.microsoft.com/en-us/windows/win32/api/gdiplusgraphics/nf-gdiplusgraphics-graphics-graphics(constgraphics_)
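To illustrate the point about members without a default constructor, here is a minimal sketch in portable C++. Surface is a hypothetical stand-in for Gdiplus::Graphics (so the example compiles without the Windows headers), and EagerRenderer and LazyRenderer are illustrative names, not part of the original code. It shows two common options under those assumptions: construct the member in the initializer list, or hold it through a pointer and create it later, as a function like InitFromHDC might.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for Gdiplus::Graphics: it has no default constructor.
class Surface {
public:
    explicit Surface(const std::string& target) : target_(target) {}
    const std::string& target() const { return target_; }
private:
    std::string target_;
};

// Option 1: pass the required argument in the member initializer list.
class EagerRenderer {
public:
    explicit EagerRenderer(const std::string& target)
        : surface_(target), active_(false) {}
    const Surface& surface() const { return surface_; }
    bool active() const { return active_; }
private:
    Surface surface_;
    bool active_;
};

// Option 2: hold a pointer and create the object once a target is known.
class LazyRenderer {
public:
    LazyRenderer() = default;
    void init(const std::string& target) {
        surface_ = std::make_unique<Surface>(target);
    }
    bool ready() const { return surface_ != nullptr; }
    const Surface& surface() const {
        assert(surface_ && "init() must be called first");
        return *surface_;
    }
private:
    std::unique_ptr<Surface> surface_;
};
```

The second option fits a class like Renderer that only learns its drawing target after construction; the first is simpler when the target is available up front.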

passing function with parameters to another function

Is there a way to pass a function to another function so that the passed function is called inside the second function with the parameters that were given when it was passed? The passed function can have different parameters each time.
I do not want to call the function with data from inside the calling function, only with the parameters I supplied when calling the function that makes the actual call. Basically, it is a periodic check to see whether it's OK to continue.
An example of what I'm trying to do:
bool CheckSomething1 (int a, int b) { /* some code */ }
bool CheckSomething2 (int a, int b, int c) { /* some code */ }
bool WaitForTrue ( something funct something.. )
{
    while (! funct) {
        /* do some work */
    }
}
int _tmain(int argc, _TCHAR* argv[])
{
    WaitForTrue(CheckSomething1(1, 2));
    WaitForTrue(CheckSomething2(1, 2, 3));
    return 0;
}
Edit: Basically, I'm looking for a way to pass a function, which may take a different number of parameters of different types, to another function that runs it. The arguments can differ each time the function is passed, but they stay the same while inside the function it is passed to.
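What you're looking for is std::bind (declared in <functional>), which packages a function together with its arguments into a callable object that can be invoked later.
See its documentation for a brief example of how it works.
Another way to implement this style of control flow (already popular in functional programming; see continuation-passing style, CPS) is currying, although it is not so easy to express in C++.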
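As a rough sketch of how the std::bind suggestion could apply to the question's WaitForTrue, the version below accepts any callable taking no arguments and returning bool. The bodies of CheckSomething1 and CheckSomething2 and the max_attempts parameter are assumptions added for illustration (the original loops forever and leaves the checks unspecified).

```cpp
#include <cassert>
#include <functional>

// Placeholder checks; the real conditions are not given in the question.
bool CheckSomething1(int a, int b) { return a + b > 2; }
bool CheckSomething2(int a, int b, int c) { return a + b + c > 100; }

// Calls `check` repeatedly until it returns true or max_attempts is reached.
// Returns true if the check eventually succeeded.
// max_attempts is an assumption so that the sketch always terminates.
bool WaitForTrue(const std::function<bool()>& check, int max_attempts = 1000) {
    assert(check);  // an empty std::function cannot be called
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (check()) {
            return true;
        }
        /* do some work */
    }
    return false;
}
```

A call then looks like WaitForTrue(std::bind(CheckSomething1, 1, 2)) or, equivalently, WaitForTrue([] { return CheckSomething2(1, 2, 3); }). The arguments are fixed at the call site, and WaitForTrue never needs to know how many there were.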

CUDA multiple files

Is there any way to do something like this in CUDA + C++?
class : 1
class1
{
__device__ ....//some cuda code
void ExecuteCuda1(); //this should execute the cuda code in this class
}
class : 2
class2
{
__device__ ....//some cuda code
void ExecuteCuda2(); //this should execute the cuda code in this class
}
class : 3
class3
{
cl1 = new class1();
cl1->ExecuteCuda1();
cl1 = new class2();
cl1->ExecuteCuda2();
}
No, your code won't work, because ExecuteCuda1 and ExecuteCuda2 are __host__ functions, meaning they will execute on the CPU. It's illegal for a __host__ function to call any function marked as __device__, even if they are member functions of a common class.
Structuring your code like this will work:
__global__ void kernel1() {...}
class Class1
{
public:
    void ExecuteCuda1()
    {
        // launch kernel1
        kernel1<<<...>>>();
    }
};
__global__ void kernel2() {...}
class Class2
{
public:
    void ExecuteCuda2()
    {
        // launch kernel2
        kernel2<<<...>>>();
    }
};
class Class3
{
public:
    void ExecuteCuda3()
    {
        Class1 cl1;   // local objects; no new, so nothing leaks
        cl1.ExecuteCuda1();
        Class2 cl2;
        cl2.ExecuteCuda2();
    }
};
Note that __global__ functions cannot be member functions, even if they are declared static. That's why the kernels must be defined outside of any class. As Dan says, the device code must be defined within a single translation unit when no device linker is available (CUDA 5.0 and later can link device code across files when compiled with nvcc's -rdc=true option). Without that, you can still split the code across multiple files by using header files.
Yes, but device code (i.e. functions marked with __device__ or __global__) must be visible in the same compilation unit where it is needed, i.e. there is no linking of device code.
So as long as the definitions of Class1 and Class2, not just the declarations, are in headers, it should work.

code for optimized watch

I want to implement code that keeps watch for some event. At the moment I don't have any built-in event watcher, so I have to implement my own, one that consumes as little CPU and memory as possible.
Can you suggest one?
For example, in pseudocode:
while(true)
{
    if(process.isrunning)
        process.kill();
}
If you don't have any event to hook into, then your code has to be "active" to run the checks, and that costs CPU cycles.
What you can do to reduce the waste is add a call to sleep (Thread.Sleep in .NET, sleep in some implementations of C++).
while (true) {
    if(process.isrunning)
        process.kill();
    sleep(100); // Wait 100 milliseconds before trying again
}
But that will make your code a little less responsive.
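The sleep-based loop above could look roughly like this in standard C++. This is only a sketch: poll_until, its parameters and the max_polls cap are hypothetical names introduced here, and the condition is passed in as a callable because the question's process object is not defined.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Checks `condition` once per `interval` until it returns true or
// `max_polls` checks have been made. Returns the number of checks performed.
// The thread sleeps between checks, so it uses almost no CPU while waiting.
int poll_until(const std::function<bool()>& condition,
               std::chrono::milliseconds interval,
               int max_polls) {
    assert(max_polls > 0);
    int polls = 0;
    while (polls < max_polls) {
        ++polls;
        if (condition()) {
            break;
        }
        std::this_thread::sleep_for(interval);
    }
    return polls;
}
```

A longer interval lowers CPU use further but increases the worst-case delay before the condition is noticed, which is the responsiveness trade-off mentioned above.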
You can try using a timer queue: http://msdn.microsoft.com/en-us/library/ms687003%28VS.85%29.aspx It uses the kernel scheduler to call your callback function at a specified interval. The callback runs on a different thread, so it won't block the main thread and your application stays responsive. The thread is managed by Windows, so you don't have to manage your own thread pool, and the timing is relatively accurate.
Implementation example:
#include <windows.h>
// A singleton class that holds the timer queue
class TimerQueue {
protected:
    HANDLE timerQueue;
    TimerQueue() {
        this->timerQueue = ::CreateTimerQueue();
    }
    ~TimerQueue() {
        if(this->timerQueue) {
            ::DeleteTimerQueueEx(this->timerQueue,NULL);
            this->timerQueue = NULL;
        }
    }
public:
    static HANDLE getHandle() {
        static TimerQueue timerQueueSingleton;
        return timerQueueSingleton.timerQueue;
    }
};
// Timer base class
class Timer
{
protected:
    HANDLE timer;
    virtual void timerProc() = 0;
    // Static trampoline: forwards the callback to the object passed as param
    static void CALLBACK timerCallback(PVOID param,BOOLEAN timerOrWait) {
        Timer* self = (Timer*)param;
        self->timerProc();
    }
public:
    Timer(DWORD startTimeMs,DWORD periodTimeMs) {
        if(!::CreateTimerQueueTimer( &this->timer, TimerQueue::getHandle(),
                &Timer::timerCallback,
                this, startTimeMs, periodTimeMs,
                WT_EXECUTEDEFAULT) ) {
            this->timer = NULL;
        }
    }
    virtual ~Timer() {
        if(this->timer) {
            // The second argument is the timer handle itself, not its address
            ::DeleteTimerQueueTimer(TimerQueue::getHandle(),this->timer,NULL);
            this->timer = NULL;
        }
    }
};
// Derive and implement timerProc
class MyTimer : public Timer
{
protected:
    virtual void timerProc() {
        // process is the placeholder object from the question
        if(process.isRunning()) {
            process.kill();
        }
    }
public:
    MyTimer(DWORD startTimeMs,DWORD periodTimeMs)
    : Timer(startTimeMs,periodTimeMs) {}
};
// Usage:
int main(int argc,char* argv[]) {
    MyTimer timer(0,100); // start immediately, at 10 Hz interval
    ::Sleep(1000);        // keep the process alive so the timer can fire
}
Disclaimer: I haven't tested or compiled this code, so you should double-check it.
Although you've tagged this as language-agnostic, any good implementation is going to vary widely not just from one language to another, but across operating systems. There are plenty of circumstances where programs or operating system functions need to do just this sort of thing, and mechanisms will have been implemented to do this in as sensible, non-intrusive a way as possible.
If you have a particular language and/or operating system in mind, please tell us, and give us a better idea of what you're trying to achieve. That way we can point you towards the most appropriate of the many possible solutions to this problem.