I'm trying to optimize some molecular simulation code (written entirely in Fortran) by using GPUs. I've developed a small subroutine that performs matrix-vector multiplication using the cuBLAS Fortran bindings (non-thunking: /usr/local/cuda/src/fortran.c on Linux).
When I tested the subroutine outside of the rest of the code (i.e. without any other external subroutine calls), everything worked. When I compiled, I used the flags -names uppercase -assume nounderscore; without them, I would receive undefined reference errors.
When porting this into the main program of the molecular dynamics code, however, the -assume nounderscore -names uppercase flags break all of my other function calls in the main program.
Any idea of a way around this? Please refer to my previous question, where -assume nounderscore -names uppercase was suggested.
Thanks in advance!
I would try Fortran-C interop. With something like
interface
   ! Illustrative sketch; check fortran.c for the exact devptr_t kind
   integer(c_int) function cublas_alloc(n, elem_size, dev_ptr) bind(C, name="CUBLAS_ALLOC")
      use iso_c_binding, only: c_int, c_intptr_t
      integer(c_int) :: n, elem_size      ! passed by reference, as fortran.c expects
      integer(c_intptr_t) :: dev_ptr      ! device pointer held in an integer
   end function cublas_alloc
end interface
the binding name can be uppercase or lowercase, whatever you need, for example bind(C, name="CUBLAS_ALLOC"). No underscore will be appended to that, so you don't need -assume nounderscore -names uppercase at all.
The iso_c_binding module might also be helpful.
I am new to Numba and am trying to apply it to existing NumPy code that is very FLOP-intensive. The function I want to apply @jit to, however, calls other functions that, in turn, use the NumPy fft module.
It seems that, in order to apply @jit to a function, it must also be applied to the functions it calls. As a consequence, I cannot apply @jit to my function: that would require applying it to all the functions it calls and, ultimately, to the functions that use the fft module, which is not supported by Numba. Is there a way around this? For instance, a way to tell Numba the data types of the values returned by the called functions and instruct it to leave those calls alone, so that I can apply @jit only to the one function and not to those it calls?
The usual way to solve this problem is to split the function into three parts: a header function using Numba, a pure-Python function (calling np.fft), and a footer Numba function, with an overall pure-Python function coordinating the calls to the others. Having each function operate on big arrays keeps this fast despite the overhead of CPython-level calls.
That being said, there is an experimental Numba feature called objmode meant to solve this specific problem. You need to specify the types of the inputs/outputs of the section, and there are a few limitations mentioned in the documentation. Note that it is currently quite unstable: so far I have encountered a few non-deterministic compilation errors (on valid code) and crashes while using it.
Suppose I have a pointer to a __global__ function in CUDA. Is there a way to programmatically ask CUDART for a string containing its name?
I don't believe this is possible by any public API.
I have previously tried poking around in the driver itself, but that doesn't look too promising. The compiler-emitted code for the <<< >>> kernel invocation clearly registers the mangled function name with the runtime via __cudaRegisterFunction, but I couldn't see any obvious way to perform a lookup by name/value in the runtime library. The driver API equivalent, cuModuleGetFunction, leads to an equally opaque type from which it doesn't seem possible to extract the function name.
Edited to add:
The host compiler itself doesn't support reflection, so there are no obvious fancy language tricks that could be pulled at runtime. One possibility would be to add another preprocessor pass to the compilation trajectory to build a static kernel function lookup table before the final build. That would be rather a lot of work, but it could be done, at least for "classic" compilation where everything winds up in a single translation unit.
I hear the term "function application" used (mostly related to Haskell), and it seems like it just means "calling a function". The wikipedia page basically calls it a mathematical term for calling a function:
In mathematics, function application is the act of applying a function to an argument from its domain so as to obtain the corresponding value from its range.
What is the difference between calling a function and function application?
Calling a function seems to imply that you are invoking a runtime operation in a programming language, which executes an abstraction to compute the function's result. Function application seems more like a generalized term, usable whenever we'd like to talk about... function application... in any setting, e.g. at compile time, syntactically, or mathematically.
Function application may also refer to apply. Historically, in various programming languages, apply is a higher-order function that takes a function reference and an argument list, and whose result is f(argument list).
In Haskell, function application most likely refers to applying a function to one argument at a time, since all Haskell functions are curried. In Haskell, all you need is a space to denote function application (the $ operator does nothing but change the precedence/grouping, allowing fewer parentheses; contrast this with LISP). Compare that with the "normal" notation we learn in basic algebra and use in non-functional programming, where f(a,b,c) represents the function f applied to arguments a, b, c. I don't think you'd use the term "call a function" unless you were dealing with an abstraction that actually calls functions, and I'm not even sure Haskell has one: Haskell might, for example, reduce functions by pattern matching, though "call a function" may still be reasonable shorthand in Haskell.
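A small illustration of those points (add and inc are made-up names):

```haskell
add :: Int -> Int -> Int
add x y = x + y

main :: IO ()
main = do
  print (add 2 3)   -- ordinary application by juxtaposition: 5
  print $ add 2 3   -- same thing; ($) only changes the grouping
  let inc = add 1   -- partial application: add applied to one argument
  print (inc 41)    -- 42
```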
Barebones explanation:
Haskell and other functional languages are much more abstractly mathematical in their notion of what a function is.
In a procedural language you call a function, which is a collection of statements which may or may not operate on data.
In a functional language you have data, and you apply a function to it to do something with the data.
So, I think I have a very weird question.
So, let's say that I already have a program loaded on my GPU, and in that program I call a function X, but that function X is not defined yet.
I want to be able to modify that function X dynamically, completely changing its code and putting it into the program, without recompiling the rest or losing any pointers whatsoever.
To compare it with something that most of us know: I want to be able to do what shaders allow in OpenGL. In the middle of execution, I can change the code of one shader, recompile only that shader, reactivate the program, and use the new version.
So, is it possible? Or do I need to recompile the whole thing every time? And if I have to recompile, do I lose the arrays that I created in global memory?
Thanks
W
If you compile with the -cuda flag using nvcc, you can get the intermediate C++ source that streams PTX to the processor. In theory, you could post-process this intermediate output to generate PTX on the fly and send it over. You might even be able to make the PTX self-modifying, but that's way out of my league.
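For completeness, a hedged sketch of how runtime-generated PTX can be loaded through the CUDA driver API without disturbing existing allocations (the file name x_v2.ptx and kernel name X are hypothetical; error checking omitted; device memory allocated in the context survives module loads and unloads):

```c
#include <cuda.h>   /* CUDA driver API; link with -lcuda */

int main(void)
{
    CUdevice dev; CUcontext ctx; CUmodule mod; CUfunction f;
    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);
    /* Buffers allocated here with cuMemAlloc belong to the context and
       persist across the module load/unload below. */
    cuModuleLoad(&mod, "x_v2.ptx");       /* PTX generated at run time */
    cuModuleGetFunction(&f, mod, "X");    /* fetch the regenerated X by name */
    /* ... launch f with cuLaunchKernel, reusing existing device pointers ... */
    cuModuleUnload(mod);                  /* later: load a newer version of X */
    cuCtxDestroy(ctx);
    return 0;
}
```

This is only a sketch of the driver-API module mechanism; generating the PTX itself is the hard part the answer above describes.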
I have a relatively basic question: I've been having trouble calling a function from a separate file. My googling has come up short; there is a lot for other languages but not much in the way of MIPS.
Any help would be appreciated
MIPS isn't a language, it is an instruction set architecture.
Assuming you really mean that you are programming in MIPS assembly AND you are using the GCC toolchain, including the GNU assembler, you need to mark your function with a .global myfunc directive in the file where it is implemented; the linker should then be able to resolve the function name where it is used in another file, e.g. jal myfunc.
You don't need to use an .extern myfunc directive in the file where myfunc is used because the GNU tools treat all undefined symbols as external.
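A minimal two-file sketch of that arrangement (GNU assembler syntax; myfunc and its body are made up for illustration):

```
# file: myfunc.s
        .text
        .global myfunc          # export the symbol so the linker can see it
myfunc:
        addiu   $v0, $a0, 1     # example body: return argument + 1
        jr      $ra

# file: main.s
        .text
        .global main
main:
        li      $a0, 41
        jal     myfunc          # cross-file call, resolved at link time
        # result is now in $v0
        jr      $ra
```

Assemble both files and link them together (e.g. with the GCC driver) and the jal is fixed up by the linker.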
If you are using MARS, then none of this applies.