In CUDA we can get to know about errors simply by checking return type of functions such as cudaMemcpy(), cudaMalloc() etc. which is cudaError_t with cudaSuccess. Is there any method available in JCuda to check error for functions such as cuMemcpyHtoD(), cuMemAlloc(), cuLaunchKernel() etc.
First of all, the methods of JCuda (should) behave exactly like the corresponding CUDA functions: They return an error code in form of an int. These error codes are also defined in...
the cudaError class for the Runtime API
the CUresult class for the Driver API
the cublasStatus class for JCublas
the cufftResult class for JCufft
the curandStatus class for JCurand
the cusparseStatus class for JCusparse
and are the same error codes as in the respective CUDA library.
All these classes additionally have a static method called stringFor(int) - for example, cudaError#stringFor(int) and CUresult#stringFor(int). These methods return a human-readable String representation of the error code.
So you could do manual error checks, for example, like this:
int error = someCudaFunction();
if (error != 0= {
System.out.println("Error code "+error+": "+cudaError.stringFor(error));
}
which might print something like
Error code 10: cudaErrorInvalidDevice
But...
...the error checks may be a hassle. You might have noticed in the CUDA samples that NVIDIA introduced some macros that simplify the error checks. And similarly, I added optional exception checks for JCuda: All the libraries offer a static method called setExceptionsEnabled(boolean). When calling
JCudaDriver.setExceptionsEnabled(true);
then all subsequent method calls for the Driver API will automatically check the method return values, and throw a CudaException when there was any error.
(Note that this method exists separately for all libraries. E.g. the call would be JCublas.setExceptionsEnabled(true) when using JCublas)
The samples usually enable exception checks right at the beginning of the main method. And I'd recommend to also do this, at least during the development phase. As soon as it is clear that the program does not contain any errors, one could disable the exceptions, but there's hardly a reason to do so: They conveniently offer clear information about which error occurred, whereas otherwise, the calls may fail silently.
Related
As a side question to Use Vulkan VkImage as a CUDA cuArray, how could I get more details on what's wrong on a CUDA Driver API call that returns CUDA_ERROR_INVALID_VALUE?
Specifically, the call is to cuExternalMemoryGetMappedMipmappedArray() and the documentation does not list CUDA_ERROR_INVALID_VALUE among its return values.
Any suggestions on how to go about debugging this issue?
Specifically, the call is to cuExternalMemoryGetMappedMipmappedArray() and the documentation does not list CUDA_ERROR_INVALID_VALUE among its return values.
That appears to be have been a transient documentation error. The current documentation linked in the question (CUDA 11.5 at the time of writing), shows CUDA_ERROR_INVALID_VALUE as an expected return value.
As for the debugging part, the function only has two inputs, the memory object handle, and the array descriptor. One of those is invalid. It should be trivial to debug if you know that the function call is returning the error, and not a prior call.
Is there a way to pass any meaningful message, not “std::exception” to the promise’s fail callback? In the sources I found the following “void FB::variantDeferred::reject(std::exception e) const” specification. It seems when reject is called with any exception derived from std::exception the slicing happens and the right exception’s message is lost. Is there any workaround but to pass error through success callback?
std::exception is simply a base class for creating exceptions in C++. You can use a number of different methods to pass a specific string back; for example, you could throw a std::runtime_error, which accepts a message.
You could also subclass std::exception and provide an implementation of std::exception::what which returns a useful string representation of what you want.
FireBreath 2.0 will use the error message from e.what() when it creates the Error object. You can find this in the code, if you're curious how that works:
NPAPI
ActiveX
FireWyrm (used for Native Messaging)
I'm just using NewRelic error trapping for my coldbox application. From OnException method, I'm just sending the error struct to log the error.
My code in onexception method
public function onException(event,rc,prc){
NewRelic.logError( prc.exception.getExceptionStruct());
}
The logerror() method resides in NewRelic.cfc and contains the following code
public boolean function logError(
required struct exception
) {
var cause = arguments.exception;
var params = {
error_id = createUUID(),
type: arguments.exception.type,
message: arguments.exception.message
};
writeDump(this.newRelic);
this.newRelic.noticeError(cause, params);abort;
return true;
}
So while error, I'm gettig the following error.
The noticeError method was not found.
You can see that, the noticeError() method is there in the object, but it is overloaded with arguments.
I'm using the same code for NewRelic error trapping in another coldfusion project without any frameworks.
Calling error.cfm through Cferror tag, and the code in error.cfm as follows
<cfset Application.newRelic.logError( variables.error )>
And in NewRelic.cfc, the logerror() method contains the same code as in the coldbox application. But it is logging errors in NewRelic without any issues.
This is the method I need to notice errors and log it in NewRelic.
noticeError(java.lang.Throwable, java.util.Map)
So I just thought to get the classname of the first argument Cause through the following code from both applications within logError() in NewRelic.cfc, to get the difference.
writeDump(cause.getClass().getName());
I'm getting
coldfusion.runtime.ExceptionScope for Coldbox application
and
coldfusion.runtime.UndefinedVariableException for normal coldfusion application
The cause argument is not throwable from coldbox application. So how to get the original error struct from coldbox application? and make it throwable to fix the noticeError method was not found issue.
The change in the underlying class happens when ColdBox duplicates the error object with CFML's duplicate() method. I doubt that ColdFusion behavior is documented anywhere, but I don't see an easy way to get around it right now other than creating your own instance of a java.langException and populating it with the details of the original error.
If you want to modify the ColdBox core code, this happens here:
https://github.com/ColdBox/coldbox-platform/blob/master/system/web/context/ExceptionBean.cfc#L43
I have entered this ticket for the ColdBox framework for us to review if we can stop duplicating the error object in future versions of the framework.
https://ortussolutions.atlassian.net/browse/COLDBOX-476
Update: Adam Cameron pointed out this ticket in the Adobe bug tracker that details this behavior in the engine. It was closed as "neverFix".
https://bugbase.adobe.com/index.cfm?event=bug&id=3976478
In Lablgtk, whenever there is an exception in a callback, the exception is automatically caught and an error message is printed in the console, such as:
(prog:12345) LablGTK-CRITICAL **: gtk_tree_model_foreach_func:
callback raised an exception
This gives no stack trace and no details about the exception, and because it is caught I cannot retrieve this information myself.
Can I enable more detailed logging information for this case? Or prevent the exception from being caught automatically?
I guess the best way to do so is to catch your exception manually and handle it yourself.
let callback_print_exn f () =
try f () with
e -> my_exn_printer e
Assuming val my_exn_printer : exn -> unit is your custom exception printer, you can simply print your callbacks exceptions by replacing ~callback:f by ~callback:(callback_print_exn f) in your code.
Of course, you can also with that method send that exception to another
thread, register a "callback id" that would be passed along with your exception...
About the stack trace, I'm not sure you can retrieve it easily. As it's launched as a callback, you probably want to know the signal used and that can be stored in your callback handler.
I had another similar issue, but this time it was harder to find where to put the calls to intercept the exception.
Fortunately, this time there was a very specific error message coming from the Glib C code:
GLib-CRITICAL **: Source ID ... was not found when attempting to remove it`
Stack Overflow + grep led me to the actual C function, but I could not find which of the several Lablgtk functions bound to this code was the culprit.
So I downloaded the Glib source, added an explicit segmentation fault to the code, compiled it and used LD_LIBRARY_PATH to force my modified Glib version to be loaded.
Then I ran the OCaml binary with gdb, and I got my stack trace, with the precise line number where the Lablgtk function was being called. And from there it was a quick 3-line patch.
Hacks like this one (which was still faster than trying to find where to intercept the call) could be avoided by having a "strict mode" preventing exceptions from being automatically caught. I still believe such a switch should be available for Lablgtk users, and hope it will eventually be available.
When I design a class I often have trouble deciding if I should throw an exception or have 2 func with the 2nd returning an err value. In the case of 2 functions how should I name the exception and non exception method?
For example if I wrote a class that decompresses a stream and the stream had errors or incomplete I would throw an exception. However what if the app is trying to recover data from the stream and excepts an error? It would want a return value instead? So how should I name the 2nd function?
Or should I not have both an exception method and a nonexception method?
Or should I not have both an exception method and a nonexception method?
That. Unless you really have time to burn maintaining two separate but mostly-identical methods.
If you really need to allow for clients that won't consider errors exceptional, then just indicate them with a return value and be done with it... Otherwise, just write the exception-throwing version and let the odd error-eating client handle the exception.
It depends on the language, but...
In my opinion, two versions of each potentially failing method imposes too high a cognitive burden on the API user, and too high a burden on the API maintainer. My personal preference is for exceptions, since that's fewer parameters to remember the order of.
I believe that you should try to use exceptions even if you're not going to exit program in some cases. You just need to create a specific exception type for your errors. And catch only them when you need to do some spefic logic. All other exceptions will go to upper level of your code.
For example you've created function which throws exception on any error. And you don't want to exit program if user specified incorrect file name. Here is how it can look:
## this is top level try/catch block
try {
## your main code is here
...
## somewhere deep in your code
try {
## we trying to open file specified by user
}
catch (FileNotFoundException) {
## we are not going to exit on this error
## let's just show a user an error message
## and try to ask different file to open
}
} catch (Exception) {
## catch all exceptions here
## the best thing we can do here is save exception to log and quit }
}
We just need to create hierarchy of exceptions (if your language allows it):
Exception <-- MoreSpecificException
This approach is used in Java
Exceptions should normally be used for exceptional conditions, whatever they may be. Being unable to decompress a file is probably an exceptional condition (unless, for example, you're writing a program to scan for badly compressed files). Bad data may or may not be exceptional.
If you've got a class decompressing a stream, then what it should do is decompress the stream, and not try to interpret its contents. Another class should use the first class to get decompressed input, and do the interpretation. That gives you separation of functionality, and good cohesion. Avoid classes where you're tempted to put an "And" in their names: "DecompressAndParseInput" is a bad class name.
Given two classes, there's no particular reason why you have to use the same error-reporting
method for both. The decompressor could throw, and the parser could return an error code.
I would only throw an exception on the decompression function if the results were not usable. If they were usable, return the results and then the function that reads those results can throw an exception instead. IE
try
{
results = decompress(file); // only throws exceptions on non-usable files
} catch (FileNotFoundException) {
// file was not usable, can't recover anything, insert nuclear error handling here
}
try
{
read(results);
} catch (ErrorThatIsRecoverableException) {
partialRead(results);
}
so read() is the function you call on normal data, partialRead is the function that handles trying to recover screwed-up-but-still-usable data. And of course, you don't necessarily need exceptions or separate functions at all- you could do the error handling all within the read() function.