cython extensions using cuda - cuda

I have a conv net implementation as a C++ class. The class is built on top of a template library (mshadow) that generates CUDA code, so it takes the form of a header file. Consequently, it can only be used in files compiled using nvcc. I am now trying to wrap this class in Python for easier loading and saving of parameters, data, etc.
How do I go about wrapping the C++ class using Cython? I looked at npcuda-example, which demonstrates how to write a wrapper pyx file around a C++ class. Unfortunately, in this example, the pyx file compiles to a cpp file. This will not work for me because I need to include the class header in the pyx file and compile it using nvcc.
I believe I could use the setup.py from npcuda-example if there were some way to force the wrapper pyx file to compile to a cu file so that nvcc would be called when distutils tried to compile the extension.
Any ideas?

In the npcuda-example, wrapper.pyx ties into the *.cu code by declaring the C++ class with
cdef extern from "src/manager.hh"
I guess that is exactly what you want?
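For illustration, here is a minimal sketch of the kind of compiler hook a setup.py can use to send .cu sources to nvcc while everything else still goes to the host compiler. It assumes a Unix-style distutils compiler object that exposes src_extensions, compiler_so, set_executable() and _compile(); the function name, the nvcc argument and the {"gcc": ..., "nvcc": ...} layout of the extra arguments are choices made for this sketch, not something stated in the thread.

```python
import os


def customize_compiler_for_nvcc(compiler, nvcc="nvcc"):
    """Teach a distutils-style Unix compiler object to send .cu files to nvcc.

    `compiler` is assumed to expose src_extensions, compiler_so,
    set_executable() and _compile().
    """
    # Accept .cu files as extension sources in the first place.
    compiler.src_extensions.append(".cu")

    default_compiler_so = compiler.compiler_so
    original_compile = compiler._compile

    def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts):
        # Assumption: extra_postargs is a dict {"gcc": [...], "nvcc": [...]}.
        if os.path.splitext(src)[1] == ".cu":
            compiler.set_executable("compiler_so", nvcc)
            postargs = extra_postargs["nvcc"]
        else:
            postargs = extra_postargs["gcc"]
        try:
            original_compile(obj, src, ext, cc_args, postargs, pp_opts)
        finally:
            # Restore the host compiler for the next source file.
            compiler.compiler_so = default_compiler_so

    compiler._compile = _compile
```

Such a hook would typically be applied to self.compiler inside a custom build_ext command before the extensions are built, with each extension's extra_compile_args given as the dict described above.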

Related

Determining path of the translated cython pyx source file

I have a cython source file in which I would like to import the local python module.
This cython source file is translated using cython (python3 syntax) into a c++ source, which in turn is compiled into a library, and then used from the main C++ program.
When the main program runs, the import of the local Python module fails because the module's location is not known to the executing code. I tried using Python 3's local import features in my pyx file, but to no avail.
The only working solution I came up with (and the most obvious one) is to update python's module search path using sys.path.append. The problem is that I have to hardcode this path, which is ugly.
I tried to find out whether the location of the source file can be retrieved from within Cython code (I could derive an absolute path from it), but without success. The usual Pythonic ways fail: for instance, __file__ evaluates to built-in, and retrieving the absolute path at runtime gives the directory from which the executable is run.
Sidenote: one of the searches I did was by querying GitHub search engine for occurrences of sys.path.append in cython files. Interestingly, all results either have paths hardcoded or they are not related to the location of the cython source file within the file system.
So my question is if it is possible within cython code to reliably retrieve the location of its source file?
Disclaimer: I could imagine instrumenting the build system to pass a preprocessor variable set to the path in question when building the C++ file derived from the Cython one, and then accessing it within the code, but this looks like overkill.
Example:
bulba.py
def fn():
    print('blah')
bulbulator.pyx
# tag: cpp
# tag: py3only
import sys
sys.path.append('/absolute_path_to_folder_with_bulba_py') # <-- this is the key part. I'd like to replace the hardcoded path with something better
from bulba import fn
fn()
bulbulator.pyx is translated into cpp with:
cython -3 --cplus bulbulator.pyx
lib_wrapper.cpp (this library, and executable which links against it, have a location different than that of py/pyx source code and its translated c++ part)
// standard headers for std::fprintf and std::exit
#include <cstdio>
#include <cstdlib>
// import headers generated by cython
#include "bulbulator_api.h"
#include "bulbulator.h"
// global initialization of the cythonized part
__attribute__((constructor))
static void
__library_init()
{
    if (int err = PyImport_AppendInittab("bulbulator", PyInit_bulbulator); err != 0)
    {
        std::fprintf(stderr, "PyImport_AppendInittab(bulbulator) failed with status code=%d\n", err);
        std::exit(1);
    }
    Py_Initialize();
    if (import_bulbulator() == -1) // <-- here it fails if I comment out sys.path.append, because bulbulator needs to know the location of bulba.py
    {
        PyErr_Print();
    }
}
I would rather put bulba.py next to the exe, but it is also possible to bake an absolute path into the Cython extension, using for example a compile-time environment variable (see cython --help for more details); let's call it ADDITIONAL_SYS_PATH:
import sys
sys.path.append(ADDITIONAL_SYS_PATH)
from bulba import fn
fn()
And now running Cython via:
cython -3 --cplus -E ADDITIONAL_SYS_PATH="the path to dir" bulbulator.pyx
will set the right value to ADDITIONAL_SYS_PATH.
Cython saves the name of the pyx-file in the resulting cpp (for example for run-time error reporting), but this name doesn't include the full path, so we need to provide the path manually.
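To avoid typing the path by hand, a build script can compute it and hand it to cython. The following is only a sketch under assumptions: the helper name cython_command is made up for this example, and it assumes bulba.py sits in the same directory as the pyx file unless another directory is given. Since -E takes a comma-separated list of name=value pairs, a directory containing a comma would presumably not survive this.

```python
import os


def cython_command(pyx_path, module_dir=None):
    """Build the cython command line, baking a directory in as ADDITIONAL_SYS_PATH."""
    if module_dir is None:
        # Assumption: the Python module to import lives next to the pyx file.
        module_dir = os.path.dirname(os.path.abspath(pyx_path))
    return [
        "cython", "-3", "--cplus",
        "-E", "ADDITIONAL_SYS_PATH=" + module_dir,
        pyx_path,
    ]
```

The resulting list can be passed to subprocess.run(..., check=True) from whatever drives the build.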

Inject a preprocessor definition into the Eclipse parser for a certain file type?

I'm using Eclipse CDT (actually NVIDIA's Nsight, but the same goes for both) to edit some source files. Some of these are meant for use both with nvcc and with regular host-side compilers, and contain instances of:
#ifdef __CUDACC__
something
#else
some other thing
#endif
I want the __CUDACC__ branch to be taken when the parser reaches such a file while parsing a .cuh or .cu file, but not when it reaches it while parsing a .h or .cpp (or .c) file. I know I can inject a preprocessor define through the project settings (using the "built-in compiler" command line), but I was wondering whether it's possible to make that conditional on the extension of the file originally being parsed (i.e. the file being edited in the IDE).
How are you configuring the project's include paths and defined macros?
If you use the Build Output Parser, can you arrange for the build system to include -D __CUDACC__ in the compiler commands for .cu files, but not for the compiler commands for .cpp files?
CDT allows for each file in the project to have its own settings, and the Build Output Parser will assign each file that has a compilation command in the build output its own settings based on the flags that appear in the command, so things should just work.

Ideal way to handle F2PY compiler with multiple Fortran files

I'm using F2PY to compile my Fortran codes, but it's a little confusing how I can sort out the dependency between files.
For example, there are two files, A.f90 and B.f90, where B.f90 uses a module defined in A.f90. How would I compile these to get a dynamic library? My approach is
from numpy import f2py
with open('A.f90') as fh:
    source = fh.read()
with open('B.f90') as fh:
    source += fh.read()
f2py.compile(source, ...)
But I don't think it's a good practice. I believe there will be better approaches for this. I would like to compile those independently but use modules from A as a dynamic library when compiling B. Any advice would be appreciated!
You can pass the Fortran files that B.f90 depends on through the extra_args option of the f2py.compile function.
Your code would then look like
with open('B.f90') as fh:
    source = fh.read()
f2py.compile(source, ..., extra_args='path_to_A.f90')
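A different route, not the one suggested above, is to skip reading the sources in Python and give all the files to the f2py command line in one invocation. The sketch below only builds that command; the helper name f2py_command and the module name mymod are made up for the example, and listing A.f90 before B.f90 is an assumption that the file defining the module should come first.

```python
import sys


def f2py_command(module_name, sources):
    """Build an f2py invocation that compiles several Fortran files into one module."""
    # "-c" asks f2py to build an extension module, "-m" names it.
    return [sys.executable, "-m", "numpy.f2py", "-c", "-m", module_name, *sources]
```

For example, subprocess.run(f2py_command("mymod", ["A.f90", "B.f90"]), check=True) would run the build, given NumPy and a Fortran compiler are installed.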

SWIG TCL Static Linking

I am trying to use SWIG to generate wrappers for some of my C++ function calls.
Also, I am trying to build my own Tcl shell, so I need to statically link the generated SWIG libraries. I have my own main function with a Tcl_AppInit call where I do some prior setup.
To do this what function should I include in my program's Tcl_AppInit call? I found that SWIG_init is not the right function. I even tried Cell_Init where cell is the name of the class in my code, but that doesn't help either.
How do I statically link SWIG object files with my own main function and Tcl_AppInit call?
Currently, I use the following command to link my executable:
g++ -o bin/icde src/core/*.o src/read/*.o src/swig/*.o src/icde/*.o -ltk -ltcl
I get the following error:
src/icde/main.o: In function `AppInit(Tcl_Interp*)':
main.cpp:(.text+0xa9): undefined reference to `Cell_Init(Tcl_Interp*)'
collect2: ld returned 1 exit status
I used objdump to check whether src/swig/cell.o contains the Cell_Init function:
~> objdump -d src/swig/cell.o | grep Cell_Init
00006461 <Cell_Init>:
646c: 75 0a jne 6478 <Cell_Init+0x17>
I am not sure if I am doing something wrong while linking.
------------------- UPDATE ----------------------------
I found that including the swig/swig.cxx file directly in the main file that calls Tcl_AppInit resolves the linking issue. Is there a reason for this?
Isn't it possible to compile the SWIG file separately and link it with the file containing the main function?
In general, with SWIG you'll end up with a bunch of generated source files that you compile. The normal thing you do then is package them up into a shared library (with appropriate bound dependencies on other shared libraries) that can be imported into a Tcl runtime with the load command.
But you don't want that this time. Instead, you want the object files that you would use to make that shared lib, and you want to include them in the instructions to build an executable along with the object file that holds your main and Tcl_AppInit. You also need to make sure that when linking your main executable that you make it dependent on those external shared libraries; executable building requires that you satisfy all dependencies and make all symbols be bound to their definitions. (You can use a static library to make this easier: it combines a bunch of object files into one file. There's very little difference to just using the object files from it though; in particular, static libraries aren't bound to their dependencies.)
Finally, you do want to include a call to Cell_Init in your Tcl_AppInit. That's the right place to put it (well, as long as you're not arranging for the package to be loaded into sub-interpreters). If it was failing before, that was because you'd got your linking wrong. (Tip: linkers work best when objects and libraries on the link line only depend on things later on the link line. Getting the link order right is a bit of a black art when you've got a complex build!)

CImg library in Cuda

I am working on some CUDA C code in VS2008 on Windows 7. I have a matrix of floats that is to be displayed as an image. I saved it as a .bin file, loaded it in a separate .cpp file, and successfully formed the image using the CImg library. However, when I try to add similar code to a .cu file, compilation fails with the strange error shown below:
error: identifier "_ZN12cimg_library4cimg9superset2IfffE4typeE" is undefined
The code snippet I tried adding to the .cu file is as follows:
#include <CImg.h>
using namespace cimg_library;
....host code.....continues...
CImg<float> img1(448,448);
for (int nn=0;nn<200704;nn++)
    img1[nn] = dR[nn]; // dR is obtained after cudaMemcpy DtoH
img1.display();
I can't find much help on the forums about this error, or about using CImg with CUDA in general.
Is there any way I can use CImg with CUDA?
Thanks
My suggestion is to move the code that uses CImg to a .cpp file. The code in the .cpp file would then invoke the host/device code in the .cu file. The code in the .cu file then returns a pointer or reference to the matrix of floats back to the code in the .cpp file.
NVIDIA's nvcc is a compiler driver. It invokes a C/C++ compiler to compile files with a .c or .cpp file name. However, a .cu file has special meaning to nvcc: it does some extra parsing to look for kernel functions and certain #pragmas. I'm not an expert, but I know there is a copy of the manual floating around. Here is a link to an older copy of the manual.