check if nvcc is available in makefile - cuda

I have two versions of a function in an application, one implemented in CUDA and the other in standard C. They're in separate files, let's say cudafunc.h and func.h (the implementations are in cudafunc.cu and func.c). I'd like to offer two options when compiling the application. If the person has nvcc installed, it'll compile the cudafunc.h. Otherwise, it'll compile func.h.
Is there any way to check, from within the makefile, whether a machine has nvcc installed and adjust the compiler accordingly?
Thanks a bunch,

You could try a conditional, like
ifeq ($(shell which nvcc),) # No 'nvcc' found
func.o: func.c func.h
HEADERS += func.h
else
func.o: cudafunc.cu cudafunc.h
	nvcc -o $@ -c $< . . .
CFLAGS += -DUSE_CUDA_FUNC
HEADERS += cudafunc.h
endif
And then in the code that will call this function, it can test #if USE_CUDA_FUNC to decide which header to include and which interface to call.

This should work, included in your Makefile:
NVCC_RESULT := $(shell which nvcc 2> /dev/null)
NVCC_TEST := $(notdir $(NVCC_RESULT))
ifeq ($(NVCC_TEST),nvcc)
CC := nvcc
else
CC := g++
endif
test:
	@echo $(CC)
For GNU make, the recipe line(s) (after test:) actually start with tab characters.
With the above preamble, you can use conditionals based on the CC variable to control the remainder of your Makefile.
You would change the test: target to whatever target you want to be built conditionally.
As a test, just run the above with make. You should get output of nvcc or g++ based on whatever was detected.
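The same detection can also be done in plain shell, for example in a small wrapper script that runs before make. This is only a sketch: pick_compiler is a hypothetical helper name, and it uses the POSIX `command -v` lookup instead of `which`.

```shell
#!/bin/sh
# pick_compiler is a hypothetical helper (not part of any toolkit):
# it prints its first argument if that name resolves to a command on
# PATH, and its second argument otherwise.
pick_compiler() {
    preferred=$1
    fallback=$2
    if command -v "$preferred" >/dev/null 2>&1; then
        printf '%s\n' "$preferred"
    else
        printf '%s\n' "$fallback"
    fi
}

# Prefer nvcc, fall back to g++ (the same choice the Makefile above makes).
CC=$(pick_compiler nvcc g++)
echo "Using compiler: $CC"
```

The result could then be handed to make on the command line, e.g. make CC="$CC", which overrides any CC assignment inside the Makefile.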

Emscripten - how to use my makefile with emcc instead of gcc?

I have a project in C called triple where you can add, delete and match some Triplets. The idea now is to transform it to html using emcc and emmake.
I tried to compile it with:
emmake make
And then use:
emcc triple.o -s WASM=1 -o triple.html
But I get the error :
WARNING:root:triple.o is not valid LLVM bitcode
ERROR:root:no input files
note that input files without a known suffix are ignored, make sure
your input files end with one of: ('.c', '.C', '.i', '.cpp', '.cxx',
'.cc', '.c++', '.CPP', '.CXX', '.CC', '.C++', '.ii', '.m', '.mi',
'.mm', '.mii', '/dev/null', '.bc', '.o', '.obj', '.lo', '.dylib',
'.so', '.a', '.ll', '.h', '.hxx', '.hpp', '.hh', '.H', '.HXX', '.HPP',
'.HH')
What am I missing? Is there another way to use the make file with emcc instead with gcc?
Here is the make file I am using.
CC=emcc
triple : triple.o insert.o match.o Delete.o printList.o writeList.o
	$(CC) -o triple triple.o insert.o match.o Delete.o printList.o writeList.o
triple.o : triple.c
	$(CC) -g -c triple.c
insert.o : insert.c
	$(CC) -g -c insert.c
match.o : match.c
	$(CC) -g -c match.c
Delete.o : Delete.c
	$(CC) -g -c Delete.c
printList.o : printList.c
	$(CC) -g -c printList.c
writeList.o : writeList.c
	$(CC) -g -c writeList.c
The emcc program is a compiler front-end. This means it takes source code as input. You do not need to compile the code first with GCC. The emscripten website says it best: "use Emscripten Compiler Frontend (emcc) as a drop-in replacement for gcc in your existing project."
If you have the source there seems to be no good reason to compile to LLVM first.
What you need to do is simply replace any reference to gcc in your Makefile with emcc.
Even better - add a CC variable and use this. eg
CC=emcc
Then replace all references to the compiler with $(CC). The $(...) syntax is how you access a variable in a Makefile. Using a variable means you can easily change the compiler later.
You haven't specified an output target for the .o lines, so I think it will build to a.o.
Try modifying your makefile to say:
$(CC) -g -c triple.c -o triple.o

Kernel seems not to execute

I'm a beginner when it comes to CUDA programming, but this situation doesn't look complex, yet it doesn't work.
#include <cuda.h>
#include <cuda_runtime.h>
#include <iostream>

__global__ void add(int *t)
{
    t[2] = t[0] + t[1];
}

int main(int argc, char **argv)
{
    int sum_cpu[3], *sum_gpu;
    sum_cpu[0] = 1;
    sum_cpu[1] = 2;
    sum_cpu[2] = 0;
    cudaMalloc((void**)&sum_gpu, 3 * sizeof(int));
    cudaMemcpy(sum_gpu, sum_cpu, 3 * sizeof(int), cudaMemcpyHostToDevice);
    add<<<1, 1>>>(sum_gpu);
    cudaMemcpy(sum_cpu, sum_gpu, 3 * sizeof(int), cudaMemcpyDeviceToHost);
    std::cout << sum_cpu[2];
    cudaFree(sum_gpu);
    return 0;
}
I'm compiling it like this
nvcc main.cu
It compiles, but the returned value is 0. I tried printing from within the kernel and nothing is printed, so I assume it doesn't execute. Can you explain why?
I checked your code and everything is fine. It seems to me, that you are compiling it wrong (assuming you installed the CUDA SDK properly). Maybe you are missing some flags... That's a bit complicated in the beginning I think. Just check which compute capability your GPU has.
As a best practice I am using a Makefile for each of my CUDA projects. It is very easy to use when you first correctly set up your paths. A simplified version looks like this:
NAME=base
# Compilers
NVCC = nvcc
CC = gcc
LINK = nvcc
CUDA_INCLUDE=/opt/cuda
CUDA_LIBS= -lcuda -lcudart
SDK_INCLUDE=/opt/cuda/include
# Flags
COMMONFLAGS =-O2 -m64
NVCCFLAGS =-gencode arch=compute_20,code=sm_20 -m64 -O2
CXXFLAGS =
CFLAGS =
INCLUDES = -I$(CUDA_INCLUDE)
LIBS = $(CUDA_LIBS)
ALL_CCFLAGS :=
ALL_CCFLAGS += $(NVCCFLAGS)
ALL_CCFLAGS += $(addprefix -Xcompiler ,$(COMMONFLAGS))
OBJS = cuda_base.o
# Build rules
.DEFAULT: all
all: $(OBJS)
	$(LINK) -o $(NAME) $(LIBS) $(OBJS)
%.o: %.cu
	$(NVCC) -c $(ALL_CCFLAGS) $(INCLUDES) $<
%.o: %.c
	$(NVCC) -ccbin $(CC) -c $(ALL_CCFLAGS) $(INCLUDES) $<
%.o: %.cpp
	$(NVCC) -ccbin $(CXX) -c $(ALL_CCFLAGS) $(INCLUDES) $<
clean:
	rm $(OBJS) $(NAME)
Explanation
I am using Arch Linux x64
the code is stored in a file called cuda_base.cu
the path to my CUDA SDK is /opt/cuda (maybe you have a different path)
most important: which compute capability does your card have? Mine is a GTX 580 with a maximum compute capability of 2.0, so I have to set the NVCC flag arch=compute_20,code=sm_20, which stands for compute capability 2.0
The Makefile needs to be stored beside cuda_base.cu. I just copied and pasted your code into this file, then typed in the shell
$ make
nvcc -c -gencode arch=compute_20,code=sm_20 -m64 -O2 -Xcompiler -O2 -Xcompiler -m64 -I/opt/cuda cuda_base.cu
nvcc -o base -lcuda -lcudart cuda_base.o
$ ./base
3
and got your result.
Me and a friend of mine created a base template for writing CUDA code. You can find it here if you like.
Hope this helps ;-)
I've had the exact same problems. I've tried the vector sum example from 'CUDA by example', Sanders & Kandrot. I typed in the code, added the vectors together, out came zeros.
CUDA doesn't print error messages to the console; it only returns error codes from functions like cudaMalloc and cudaMemcpy. In my desire to get a working example, I didn't check the error codes. A basic mistake. So, when I ran the version which loads up when I start a new CUDA project in Visual Studio, and which does do error checking, bingo! An error. The error message was 'invalid device function'.
Checking out the compute capability of my card, using the program in the book or equivalent, indicated that it was...
... wait for it...
1.1
So, I changed the compile options. In Visual Studio 13, Project -> Properties -> Configuration Properties -> CUDA C/C++ -> Device -> Code Generation.
I changed the item from compute_20,sm_20 to compute_11,sm_11. This indicates that the compute capability is 1.1 rather than the assumed 2.0.
Now, the rebuilt code works as expected.
I hope that is useful.
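Both answers come down to matching the code generation flag to the card's compute capability (2.0 became compute_20/sm_20, 1.1 became compute_11/sm_11). A minimal sketch of that mapping in shell follows; gencode_flag is a hypothetical helper name, and it assumes the capability is given as "major.minor".

```shell
#!/bin/sh
# gencode_flag is a hypothetical helper: it turns a compute capability
# such as "2.0" into the matching nvcc code generation flag by dropping
# the dot ("2.0" -> "20").
gencode_flag() {
    cc_digits=$(printf '%s' "$1" | tr -d '.')
    printf '%s\n' "-gencode arch=compute_${cc_digits},code=sm_${cc_digits}"
}

gencode_flag 2.0   # prints: -gencode arch=compute_20,code=sm_20
gencode_flag 1.1   # prints: -gencode arch=compute_11,code=sm_11
```

The output could be passed into the Makefile shown earlier, e.g. make NVCCFLAGS="$(gencode_flag 1.1) -m64 -O2". Whether a given toolkit version still accepts a particular capability is a separate question that this sketch does not check.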

calling makefiles - handling exceptions

I realized that when I run makefiles from my main makefile and the child makefiles fail, the parent continues and does not exit with an error code.
I've tried to add the exception handling...but it does not work. Any ideas?
MAKE_FILES := $(wildcard test_*.mak)
compile_tests:
	@echo "Compiling tests.$(MAKE_FILES)."
	@for m in $(MAKE_FILES); do\
	$(MAKE) -f "$$m"; || $(error Failed to compile $$m)\
	done
You cannot use make functions like $(error ...) in your recipe, because all make variables and functions are expanded first, before the shell is invoked. So the error function will happen immediately when make tries to run that recipe, before it even starts.
You have to use shell constructs to fail, not make constructs; something like:
compile_tests:
	@echo "Compiling tests.$(MAKE_FILES)."
	@for m in $(MAKE_FILES); do \
	    $(MAKE) -f "$$m" && continue; \
	    echo Failed to compile $$m; \
	    exit 1; \
	done
However, even this is not really great, because if you use -k it will still stop immediately. Better is to take advantage of what make does well, which is run lots of things:
compile_tests: $(addprefix tests.,$(MAKE_FILES))
$(addprefix tests.,$(MAKE_FILES)): tests.%:
	$(MAKE) -f "$*"
One note, if you enable -j these will all run in parallel. Not sure if that's OK with you or not.
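The "run, continue on success, otherwise report and exit" pattern from the loop above can be tried on its own in a shell, outside of make. This is only a sketch: run_each is a hypothetical helper name, and each argument stands in for one `$(MAKE) -f file` invocation.

```shell
#!/bin/sh
# run_each is a hypothetical helper: it runs each argument as a command
# via "sh -c" and stops at the first failure, returning that command's
# exit status so the caller sees the error.
run_each() {
    for cmd in "$@"; do
        sh -c "$cmd" && continue
        status=$?               # status of the failed command
        echo "Failed: $cmd" >&2
        return "$status"
    done
    return 0
}

# The second command fails, so the third one is never run.
run_each 'echo first' 'exit 3' 'echo never reached' || echo "stopped with status $?"
```

Because the failure is produced by the shell (return or exit) rather than by a make function, it only happens when the command has actually failed, and the non-zero status propagates to whoever invoked the loop.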

Cuda 5.0 Linking Issue

I'm just trying to build an old project of mine using cuda 5.0 preview.
I get an Error when linking, telling me that certain cuda functions can not be found. For example:
undefined reference to 'cudaMalloc'.
My linking command includes the following options for cuda :
-L/usr/local/cuda/lib64 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux -lcudart
ls -lah /usr/local/cuda/lib64/ gives me 8 cuda libraries including libcudart.so.5.0.7 with symlinks using only the .so-file-ending.
ls /home/myhome/NVIDIA_CUDA_Samples/C/lib/ gives me an empty directory, which is kind of strange?
ls /home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux/ gives me two directories: i686 and x86_64 both containing only libGLEW.a
I have no idea which way to look for a solution. Any help is appreciated!
EDIT:
Here is my complete linking command (TARGET_APPLICATION is my binary and x86_64/Objectfiles.o stands for all (23) object files including the object file compiled with nvcc):
/home/myhome/nullmpi-0.7/bin/mpicxx -CC=g++ -I. -I/home/myhome/nullmpi-0.7/src -I/usr/lib/openmpi/include -L/usr/local/cuda/lib64 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux -lcudart -o TARGET_APPLICATION x86_64/Objectfiles.o /usr/lib/liblapack.so /usr/lib/libblas.so /home/myhome/nullmpi-0.7/lib/libnullpmpi.a -lm
I use nullmpi for compilation and linking (the project uses MPI and CUDA), which internally uses g++, as can be seen from -CC=g++. I wanted to keep this detail out of the question.
The compilation command for my cuda object file:
/usr/local/cuda/bin/nvcc -c -arch=sm_21 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -O3 kernelwrapper.cu -o x86_64/kernelwrapper.RELEASE.2.o
echo $LD_LIBRARY_PATH results in:
/usr/local/cuda/lib64:/usr/local/cuda/lib:
echo $PATH results in:
otherOptions:/usr/local/cuda/bin:/home/myhome/nullmpi-0.7/bin
I'm building 64-bit. For the sake of completeness I'm building on Ubuntu 12.04. (64bit).
Building the CUDA Samples works fine.
SOLUTION (thanks to talonmies for pointing me to it):
This is the correct linking command:
/home/myhome/nullmpi-0.7/bin/mpicxx -CC=g++ -I. -I/home/myhome/nullmpi-0.7/src -I/usr/lib/openmpi/include -L/usr/local/cuda/lib64 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux -o TARGET_APPLICATION x86_64/Objectfiles.o /usr/lib/liblapack.so /usr/lib/libblas.so /home/myhome/nullmpi-0.7/lib/libnullpmpi.a -lcudart -lm
You have your linking statements in the incorrect order. It should be something more like this:
/home/myhome/nullmpi-0.7/bin/mpicxx -CC=g++ -I. -I/home/myhome/nullmpi-0.7/src \
-I/usr/lib/openmpi/include -L/usr/local/cuda/lib64 \
-L/home/myhome/NVIDIA_CUDA_Samples/C/lib \
-L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux \
-o TARGET_APPLICATION x86_64/Objectfiles.o \
/home/myhome/nullmpi-0.7/lib/libnullpmpi.a -llapack -lblas -lm -lcudart
The source of your problem is that you have specified the CUDA runtime library before the object file that contains a dependency on it. The linker simply discards libcudart.so from the linkage because nothing depends on it at the point when it is processed. Golden rule in POSIX style compilation statements: linkage statements are parsed left-to-right, so objects containing external dependencies come first, and the libraries satisfying those dependencies come afterwards.

CUDA, MySQL, and CMake

I'm trying to create a CUDA program (which I'm new at) that involves first grabbing information from a remote MySQL database. I'm using the Connector/C library from the MySQL website inside the program, before the CUDA calls.
I'm able to compile my program with MySQL when using gcc (without any CUDA code), but not with nvcc (the CUDA compiler). A peer who is familiar with CUDA mentioned to me that he had to compile some libjpg stuff he was doing with nvcc to avoid 'wrong architecture' and linking problems. He suggested that I compile the Connector/C library with nvcc. However, the Connector/C library uses CMake instead of a regular Makefile.
So, being new to CMake, I researched some stuff and found the toolchain file which sounded a lot like what I needed (found here). However, I am running into problems during the compile where all of the default includes and libraries used in Connector/C are not included. Specifically
-- Looking for include files HAVE_ALLOCA_H
-- Looking for include files HAVE_ALLOCA_H - not found.
and
-- Looking for strstr
-- Looking for strstr - not found
Those are just a couple examples, there are many more files that are not found.
Am I approaching this problem correctly? Is there a more obvious workaround that I am just not considering? If I am right in trying to compile MySQL Connector/C with CUDA, are there any suggestions for properly including the files and libraries required for Connector/C?
Thanks for your help.
If you can separate out the CUDA kernels from your mysql calls and place them in separate files, you can use your Makefile.
I keep all of the cuda kernels and such in .cu files and then I have a definition:
#
# CUDA Compilation Rules
#
define cuda-compile-rule
$1: $(call generated-source,$2) \
  $(call source-dir-to-build-dir, $(subst .cu,.cubin, $2)) \
  $(call source-dir-to-build-dir, $(subst .cu,.ptx, $2))
	$(NVCC) $(CUBIN_ARCH_FLAG) $(NVCCFLAGS) $(INCFLAGS) $(DEFINES) -o $$@ -c $$<
$(call source-dir-to-build-dir, $(subst .cu,.cubin, $2)): $(call generated-source,$2)
	$(NVCC) -cubin -Xptxas -v $(CUBIN_ARCH_FLAG) $(NVCCFLAGS) $(INCFLAGS) $(DEFINES) $(SMVERSIONFLAGS) -o $$@ $$<
$(call source-dir-to-build-dir, $(subst .cu,.ptx, $2)): $(call generated-source,$2)
	$(NVCC) -ptx $(CUBIN_ARCH_FLAG) $(NVCCFLAGS) $(INCFLAGS) $(DEFINES) $(SMVERSIONFLAGS) -o $$@ $$<
endef
I've also included three functions for ease of use:
generated-source = $(filter %.cpp, $1) $(filter %.c, $1) $(filter %.f, $1) $(filter %.F, $1) $(filter %.cu, $1)
source-dir-to-build-dir = $(addprefix $(BUILDDIR)/, $1)
source-to-object = $(call source-dir-to-build-dir, \
$(subst .f,.o,$(filter %.f,$1)) \
$(subst .F,.o,$(filter %.F,$1)) \
$(subst .c,.o,$(filter %.c,$1)) \
$(subst .cpp,.o,$(filter %.cpp,$1)) \
$(if $(filter 1,$(USE_CUDA)),$(subst .cu,.cu.o,$(filter %.cu,$1))))
Then all you need to do is build up a list of source files and call:
$(foreach f,$(filter %.cu,$(listOfFiles)),$(eval $(call cuda-compile-rule,$(call source-to-object,$f),$f)))
Note that in the function source-to-object there is a variable which I use to conditionally disable CUDA compilation USE_CUDA.