CUDA - invalid device function, how to know [architecture, code]?

I am getting the following error when running the default generated kernel when creating a CUDA project in VS Community:
addKernel launch failed: invalid device function
addWithCuda failed!
I searched for how to solve it, and found out that I have to change Project->Properties->CUDA C/C++->Device->Code Generation (the default values for [architecture, code] are compute_20,sm_20), but I couldn't find the values needed for my graphics card (GeForce 8400 GS).
Is there any list on the net for the [architecture, code] or is it possible to get them by any command?

The numeric value in compute_XX and sm_XX is the Compute Capability (CC) of your CUDA device.
You can consult http://en.wikipedia.org/wiki/CUDA#Supported_GPUs for a (possibly incomplete) list of GPUs and their corresponding CCs.
Your rather old 8400 GS (if I remember correctly) hosts a G86 chip, which supports CC 1.1.
So you have to change the setting to compute_11,sm_11.
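You can also query the compute capability at runtime via the CUDA runtime API (the deviceQuery sample that ships with the toolkit does essentially this). A minimal sketch:

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // prop.major and prop.minor form the CC, e.g. 1.1 -> compute_11,sm_11
        printf("Device %d: %s, compute capability %d.%d\n",
               dev, prop.name, prop.major, prop.minor);
    }
    return 0;
}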

Related

What's the replacement for cuModuleGetSurfRef and cuModuleGetTexRef?

CUDA 12 indicates that these two functions:
CUresult cuModuleGetSurfRef (CUsurfref* pSurfRef, CUmodule hmod, const char* name);
CUresult cuModuleGetTexRef (CUtexref* pTexRef, CUmodule hmod, const char* name);
which obtain a reference to a surface or a texture, respectively, from a loaded module - are deprecated.
What are they deprecated in favor of? Are surfaces and textures in modules to be accessed differently? Will they be entirely out of modules? If it's the latter, how would one work with them using the CUDA driver API?
So, based on @talonmies' comment, it seems the replacements are "texture objects" and "surface objects". The main difference - as far as is evident in the API - is that the new "objects" involve fewer API calls, which take richer descriptors. Thus, the user sets the descriptor fields themselves and does not need the large number of cuTexRefGetXXXX and cuTexRefSetXXXX calls. There are also "tensor map objects", which appear with Compute Capability 9.0 and later.
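To illustrate (a sketch with my own function name, not code from the question): with the object-style driver API you fill in descriptor structs and create a CUtexObject in one call, instead of fetching a CUtexref from a module and mutating it. Here devPtr and numFloats stand in for an earlier cuMemAlloc:

#include <cuda.h>
#include <string.h>

/* Sketch: create a texture object over an existing linear device allocation. */
CUresult makeLinearTexObject(CUtexObject *texObj, CUdeviceptr devPtr, size_t numFloats)
{
    CUDA_RESOURCE_DESC resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = CU_RESOURCE_TYPE_LINEAR;
    resDesc.res.linear.devPtr = devPtr;
    resDesc.res.linear.format = CU_AD_FORMAT_FLOAT;
    resDesc.res.linear.numChannels = 1;
    resDesc.res.linear.sizeInBytes = numFloats * sizeof(float);

    CUDA_TEXTURE_DESC texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = CU_TR_ADDRESS_MODE_CLAMP;
    texDesc.filterMode = CU_TR_FILTER_MODE_POINT;

    /* One call replaces the whole cuTexRefSetXXXX family; the resulting
       CUtexObject is passed to kernels as an ordinary argument. */
    return cuTexObjectCreate(texObj, &resDesc, &texDesc, NULL);
}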

How to determine the state of peer access without producing a warning in cuda-memcheck

In a multi-GPU system, I use the return value of
cudaError_t cudaDeviceDisablePeerAccess ( int peerDevice )
to determine whether peer access is disabled; in that case, the function returns cudaErrorPeerAccessNotEnabled.
This is not an error in my program, but produces a warning in both cuda-gdb and cuda-memcheck since an API call did not return cudaSuccess.
In the same manner cudaDeviceEnablePeerAccess returns cudaErrorPeerAccessAlreadyEnabled if access has already been enabled.
How can one find out if peer access is enabled / disabled without producing a warning?
Summarizing comments into an answer: you can't.
The runtime API isn't blessed with the ability to have informational/warning level status returns as well as error returns. Everything which isn't success is treated as an error, and toolchain utilities like cuda-memcheck cannot be instructed to ignore errors. The default behaviour is to report and continue, so it will not interfere with anything, but it will emit an error message.
If you want to avoid the errors, then you will need to build a layer of your own state tracking and condition checking to preempt the calls that would return these expected errors.
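A minimal sketch of such state tracking (class and method names are mine, not part of any API): query topology once with cudaDeviceCanAccessPeer, remember what you enable, and consult your own table instead of probing via error returns:

#include <cuda_runtime.h>
#include <map>
#include <utility>

// Hypothetical tracker: remembers which (device, peer) pairs were enabled,
// so enable/disable is never called in a state that returns a non-success code.
class PeerAccessTracker {
    std::map<std::pair<int, int>, bool> enabled_;
public:
    cudaError_t enable(int device, int peer) {
        if (enabled_[{device, peer}]) return cudaSuccess;   // already on: skip the call
        int can = 0;
        cudaDeviceCanAccessPeer(&can, device, peer);        // informational, no warning
        if (!can) return cudaErrorInvalidDevice;
        cudaSetDevice(device);
        cudaError_t err = cudaDeviceEnablePeerAccess(peer, 0);
        if (err == cudaSuccess) enabled_[{device, peer}] = true;
        return err;
    }
    cudaError_t disable(int device, int peer) {
        if (!enabled_[{device, peer}]) return cudaSuccess;  // already off: skip the call
        cudaSetDevice(device);
        cudaError_t err = cudaDeviceDisablePeerAccess(peer);
        if (err == cudaSuccess) enabled_[{device, peer}] = false;
        return err;
    }
    bool isEnabled(int device, int peer) const {
        auto it = enabled_.find({device, peer});
        return it != enabled_.end() && it->second;
    }
};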

How to find more details on CUDA_ERROR_INVALID_VALUE?

As a side question to Use Vulkan VkImage as a CUDA cuArray, how could I get more details on what's wrong on a CUDA Driver API call that returns CUDA_ERROR_INVALID_VALUE?
Specifically, the call is to cuExternalMemoryGetMappedMipmappedArray() and the documentation does not list CUDA_ERROR_INVALID_VALUE among its return values.
Any suggestions on how to go about debugging this issue?
That appears to have been a transient documentation error. The current documentation linked in the question (CUDA 11.5 at the time of writing) shows CUDA_ERROR_INVALID_VALUE as an expected return value.
As for the debugging part: the function only has two inputs, the memory object handle and the array descriptor, so one of those is invalid. It should be trivial to debug once you know that this function call is returning the error, and not a prior call.
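One way to pin down which call fails (a sketch; the macro name is mine): wrap every driver API call so the first non-success result is reported immediately, using cuGetErrorName and cuGetErrorString for a readable message:

#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>

/* Report the exact call and source location of the first driver API failure. */
#define CU_CHECK(call)                                                    \
    do {                                                                  \
        CUresult rc_ = (call);                                            \
        if (rc_ != CUDA_SUCCESS) {                                        \
            const char *name_ = NULL, *desc_ = NULL;                      \
            cuGetErrorName(rc_, &name_);                                  \
            cuGetErrorString(rc_, &desc_);                                \
            fprintf(stderr, "%s:%d: %s failed: %s (%s)\n",                \
                    __FILE__, __LINE__, #call,                            \
                    name_ ? name_ : "?", desc_ ? desc_ : "?");            \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

With every preceding call checked this way, a CUDA_ERROR_INVALID_VALUE reported at cuExternalMemoryGetMappedMipmappedArray can only come from that call, narrowing the fault to its two inputs.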

PIE disabled. Absolute addressing not allowed in code signed PIE

I'm working with Xcode 4.5 with a deployment target of iOS 5.1
I'm getting the following warning when I compile my app, relating to two specific methods which have significantly increased in size.
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not allowed in code signed PIE, but used in -[mfile method]. To fix this warning, don't compile with -mdynamic-no-pic or link with -Wl,-no_pie
And subsequently the app is throwing an exception at launch with the following error:
dyld: vm_protect(0x00001000, 0x0078C000, false, 0x07) failed, result=2 for segment __TEXT in /var/mobile/Applications/8E764612-87ED-4A99-9C59-E56C934DA997/appname.app/appname
dyld dyld_fatal_error:
0x2feb20c4: trap
0x2feb20c8: nop
When I comment out the methods in question, the app runs fine.
Any suggestions?
Here is a response from the Unity forums:
In Xcode 4.6 build settings, change "Don't create position independent executables" from No to Yes. That's it.
Credit goes to amit-chai.

When does cublasInit() return a NOT_INITIALIZED status?

During my CUBLAS initialization, I get an error, i.e. not the expected CUBLAS_STATUS_SUCCESS.
Checking the returned status, I found that it is CUBLAS_STATUS_NOT_INITIALIZED, which is not listed as a possible return value of that function.
Does anyone have an idea what may have caused that behavior?
The CUBLAS 4.x documentation mentions CUBLAS_STATUS_NOT_INITIALIZED as an error code for cublasCreate, with the meaning "the CUDA Runtime initialization failed".
Can you verify that you have a valid CUDA context?
If so, did you create a valid CUBLAS context?
For CUBLAS 3.x and CUBLAS 4.x using the legacy API: did you call cublasInit while a CUDA context was active in the current thread, and did it return CUBLAS_STATUS_SUCCESS?
For CUBLAS 4.x with the new API: did you call cublasCreate, and did it return CUBLAS_STATUS_SUCCESS? Are you passing the handle it created to the cublas..._v2 methods?
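For reference, a minimal sketch of the new-API initialization path with the status checked explicitly (the early cudaSetDevice call is my own choice, made to surface a runtime-initialization failure before CUBLAS is involved):

#include <stdio.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main(void)
{
    /* Touch the runtime first so a failure of CUDA initialization itself
       is visible before cublasCreate gets a chance to report it. */
    cudaError_t cerr = cudaSetDevice(0);
    if (cerr != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice failed: %s\n", cudaGetErrorString(cerr));
        return 1;
    }

    cublasHandle_t handle;
    cublasStatus_t status = cublasCreate(&handle);
    if (status != CUBLAS_STATUS_SUCCESS) {
        /* CUBLAS_STATUS_NOT_INITIALIZED here typically means the CUDA
           Runtime initialization failed, per the CUBLAS documentation. */
        fprintf(stderr, "cublasCreate failed with status %d\n", (int)status);
        return 1;
    }

    /* ... pass handle to every cublas..._v2 call ... */

    cublasDestroy(handle);
    return 0;
}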