CUDA + CMake target library dependence breaks on different machine - cuda

I recently tried to build my https://github.com/eyalroz/cuda-api-wrappers/ library's examples after switching to another Linux distribution on the same machine. Strangely enough, I encountered a linking issue. The command:
/usr/bin/c++ -Wall -std=c++11 -g CMakeFiles/device_management.dir/examples/by_runtime_api_module/device_management.cpp.o -o examples/bin/device_management -rdynamic lib/libcuda-api-wrappers.a -Wl,-Bstatic -lcudart_static -Wl,-Bdynamic -lpthread -ldl -lrt
fails to find the CUDA runtime library, and I get:
CMakeFiles/device_management.dir/examples/by_runtime_api_module/device_management.cpp.o: In function `cuda::device::peer_to_peer::get_attribute(cudaDeviceP2PAttr, int, int)':
/home/eyalroz/src/mine/cuda-api-wrappers/src/cuda/api/device.hpp:38: undefined reference to `cudaDeviceGetP2PAttribute'
collect2: error: ld returned 1 exit status
but if I add -L/usr/local/cuda/lib64 it builds fine. This didn't use to happen before; and it doesn't happen on another machine I've checked on, nor does it even happen to other targets using the CUDA runtime in the same CMakeLists.txt (like version_managament).
FindCUDA seems to be finding everything, as the value of ${CUDA_LIBRARIES} is /usr/local/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/x86_64-linux-gnu/librt.so. And the target lines in CMakeLists.txt are:
add_executable(device_management EXCLUDE_FROM_ALL examples/by_runtime_api_module/device_management.cpp)
target_link_libraries(device_management cuda-api-wrappers ${CUDA_LIBRARIES})
as is suggested in answers to other related questions (e.g. here). Why is this happening? Should I "manually" add the -L switch?
Edit: Following @RobertCrovella's suggestion, here are the ld search paths:
$ gcc -print-search-dirs | sed '/^lib/b 1;d;:1;s,/[^/.][^/]*/\.\./,/,;t 1;s,:[^=]*=,:;,;s,;,; ,g' | tr \; \\012 | tr ':' "\n" | tail -n +3
/usr/local/cuda/lib64/x86_64-linux-gnu/5/
/usr/local/cuda/lib64/x86_64-linux-gnu/
/usr/local/cuda/lib/
/usr/lib/gcc/x86_64-linux-gnu/5/
/usr/x86_64-linux-gnu/lib/x86_64-linux-gnu/5/
/usr/x86_64-linux-gnu/lib/x86_64-linux-gnu/
/usr/x86_64-linux-gnu/lib/
/usr/lib/x86_64-linux-gnu/5/
/usr/lib/x86_64-linux-gnu/
/usr/lib/
/lib/x86_64-linux-gnu/5/
/lib/x86_64-linux-gnu/
/lib/
/usr/lib/x86_64-linux-gnu/5/
/usr/lib/x86_64-linux-gnu/
/usr/lib/
/usr/local/cuda/lib64/
/usr/x86_64-linux-gnu/lib/
/usr/lib/
/lib/
/usr/lib/
$ ld --verbose | grep SEARCH_DIR | tr -s ' ;' \\012
SEARCH_DIR("=/usr/local/lib/x86_64-linux-gnu")
SEARCH_DIR("=/lib/x86_64-linux-gnu")
SEARCH_DIR("=/usr/lib/x86_64-linux-gnu")
SEARCH_DIR("=/usr/local/lib64")
SEARCH_DIR("=/lib64")
SEARCH_DIR("=/usr/lib64")
SEARCH_DIR("=/usr/local/lib")
SEARCH_DIR("=/lib")
SEARCH_DIR("=/usr/lib")
SEARCH_DIR("=/usr/x86_64-linux-gnu/lib64")
SEARCH_DIR("=/usr/x86_64-linux-gnu/lib")
Notes:
Yes, I know the CMakeLists.txt there is ugly.

TL;DR:
After the FindCUDA invocation, add the lines:
get_filename_component(CUDA_LIBRARY_DIR ${CUDA_CUDART_LIBRARY} DIRECTORY)
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -L${CUDA_LIBRARY_DIR}")
and building should succeed on both systems.
Discussion:
(Paraphrasing @RobertCrovella and myself in the comments:)
OP was expecting that, if the following hold:
FindCUDA succeeds
${CUDA_LIBRARIES} includes a valid full path to either the static or the dynamic CUDA runtime library
the library dependency is indicated using target_link_libraries(relevant_target ${CUDA_LIBRARIES})
... then the CMake-based build he was attempting should succeed on a variety of valid CUDA installations. That is (unfortunately) not the case: while FindCUDA does locate the CUDA library path, it does not actually make your linker search that path. So a failure should actually be expected. The build had worked on OP's old system due to a "fluke", or rather, due to OP having somehow added the CUDA library directory to the linker's search path beforehand.
The linking command must be issued with the -L/path/to/cuda/libraries switch, so that the linker knows where to look for the (unspecified-path) libraries referred to by the CUDA-related -l switches (in OP's case, -lcudart_static).
This answer discusses how to do that in CMake for different kinds of targets. You might also want to have a look at man gcc (the GCC manual page, also available here) regarding the -l and -L options, if you are not familiar with them.
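To check by hand which switch the two CMake lines in the TL;DR end up adding, the same derivation can be sketched in the shell. The library path is the one FindCUDA reported in the question; the variable names are made up for illustration.

```shell
# Full path to the CUDA runtime library, as it appears in ${CUDA_LIBRARIES} above.
cudart_path=/usr/local/cuda/lib64/libcudart_static.a

# Strip the file name to get the directory the linker must search
# (the shell counterpart of get_filename_component(... DIRECTORY)).
cuda_library_dir=$(dirname "$cudart_path")

# The switch that has to reach the link command.
link_flag="-L$cuda_library_dir"
echo "$link_flag"
```

Appending the printed switch to the failing c++ command from the question is a quick way to confirm that the missing search path is the only problem.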

Related

libtool can't find the la when linking with option -L

I ran into a problem when compiling the source code of mpc, which depends on gmp.
The command to compile mpc is as below.
./configure --with-mpfr=/home/wy/tmp/mpfr-4.0.2/ins --with-gmp=/home/wy/tmp/gmp-6.2.0/ins --prefix=/home/wy/tmp/mpc-1.1.0/ins
The gmp has been installed into /home/user/tmp/gmp-6.2.0/ins successfully.
The error when compiling mpc with libtool is as below.
/bin/bash ../libtool --tag=CC --mode=link gcc -std=gnu99 -O2 -pedantic -fomit-frame-pointer -m64 -mtune=corei7 -march=corei7 -version-info 4:0:1 -L/home/wy/tmp/gmp-6.2.0/ins/lib -L/home/wy/tmp/mpfr-4.0.2/ins/lib -o libmpc.la -rpath /home/wy/tmp/mpc-1.1.0/ins/lib abs.lo acos.lo acosh.lo add.lo add_fr.lo add_si.lo add_ui.lo arg.lo asin.lo asinh.lo atan.lo atanh.lo clear.lo cmp.lo cmp_abs.lo cmp_si_si.lo conj.lo cos.lo cosh.lo div_2si.lo div_2ui.lo div.lo div_fr.lo div_ui.lo exp.lo fma.lo fr_div.lo fr_sub.lo get_prec2.lo get_prec.lo get_version.lo get_x.lo imag.lo init2.lo init3.lo inp_str.lo log.lo log10.lo mem.lo mul_2si.lo mul_2ui.lo mul.lo mul_fr.lo mul_i.lo mul_si.lo mul_ui.lo neg.lo norm.lo out_str.lo pow.lo pow_fr.lo pow_ld.lo pow_d.lo pow_si.lo pow_ui.lo pow_z.lo proj.lo real.lo rootofunity.lo urandom.lo set.lo set_prec.lo set_str.lo set_x.lo set_x_x.lo sin.lo sin_cos.lo sinh.lo sqr.lo sqrt.lo strtoc.lo sub.lo sub_fr.lo sub_ui.lo swap.lo tan.lo tanh.lo uceil_log2.lo ui_div.lo ui_ui_sub.lo -lmpfr -lmpfr -lgmp -lm
/bin/grep: /usr/local/lib/libgmp.la: No such file or directory
/bin/sed: can't read /usr/local/lib/libgmp.la: No such file or directory
libtool: error: '/usr/local/lib/libgmp.la' is not a valid libtool archive
Makefile:432: recipe for target 'libmpc.la' failed
make[2]: *** [libmpc.la] Error 1
make[2]: Leaving directory '/home/wy/tmp/mpc-1.1.0/src'
Makefile:465: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/wy/tmp/mpc-1.1.0'
Makefile:375: recipe for target 'all' failed
make: *** [all] Error 2
From the error message, we can see that the lib path has been indicated as -L/home/wy/tmp/gmp-6.2.0/ins/lib. But libtool still can't find the lib.
This is going to be hard to answer easily, but it seems like you have some other .la file that references it, and causes it to fail this way. Paths within .la files are always absolute. See this old blog post of mine for details.
My best guess for your particular case, is that you have an old copy of mpfr installed in /usr/local — and that is taking precedence over the one you want to use.
The problem appears to be in the MPFR configure scripts or makefile, not in MPC!
As you can see above, it's not looking for libgmp.la in the place you specified on the command-line, but in the default installation location. The reason for this is that the location of libgmp.la is specified incorrectly in libmpfr.la! It's not MPC's fault...
I was able to work around the problem by editing libmpfr.la, which you can find in the directory where you told MPFR to install its libraries, and changing the location of libgmp.la from /usr/lib/libgmp.la to the actual location where you installed the GMP libraries.
The line in libmpfr.la should look like this:
# Libraries that this one depends upon.
dependency_libs=' -L/your/mpfr/lib/target /your/gmp/lib/target/libgmp.la'
Where "/your/mpfr/lib/target" should be where you told MPFR to put its library files, and "/your/gmp/lib/target" needs to be changed to where you told GMP to put its library files.
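If you would rather not edit the file by hand, the substitution can be scripted. This is only a sketch: it works on a scratch file standing in for libmpfr.la, it assumes the stale entry is the /usr/local/lib/libgmp.la path from the error message in the question, and the variable names are made up.

```shell
# Placeholder for your real GMP library directory.
gmp_libdir=/your/gmp/lib/target

# Scratch file standing in for libmpfr.la, so the sketch is safe to run as-is.
la_file=$(mktemp)
cat > "$la_file" <<'EOF'
# Libraries that this one depends upon.
dependency_libs=' -L/your/mpfr/lib/target /usr/local/lib/libgmp.la'
EOF

# Replace the stale libgmp.la path; '|' is the sed delimiter because the paths contain '/'.
sed "s|/usr/local/lib/libgmp.la|$gmp_libdir/libgmp.la|" "$la_file" > "$la_file.fixed"

grep dependency_libs "$la_file.fixed"
```

On a real installation you would point la_file at the installed libmpfr.la and keep a backup before replacing it with the fixed copy.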

C code using gcc cannot link to mysql header?

I'd like to build on this post, because my symptoms are identical, but the solution seems to be something else.
I am working in a Ubuntu container (Ubuntu 16.04.3 LTS), trying to compile a toy C program that will eventually connect to an SQL server. First things first: I have the latest libmysqlclient-dev installed:
root@1234:/home# apt install libmysqlclient-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libmysqlclient-dev is already the newest version (5.7.24-0ubuntu0.16.04.1).
0 upgraded, 0 newly installed, 0 to remove and 88 not upgraded.
root@1234:/home#
And when I look in the /usr/include/mysql directory, I see the critical header file I'll need:
root@1234:/home# ls -l /usr/include/mysql | grep mysql.h
-rw-r--r-- 1 root root 29207 Oct 4 05:48 /usr/include/mysql/mysql.h
root@1234:/home#
So far, so good. Now I found this little toy program from here:
#include <stdio.h>
#include "/usr/include/mysql/mysql.h"
int main() {
    MYSQL mysql;
    if(mysql_init(&mysql)==NULL) {
        printf("\nInitialization error\n");
        return 0;
    }
    mysql_real_connect(&mysql,"localhost","user","pass","dbname",0,NULL,0);
    printf("Client version: %s",mysql_get_client_info());
    printf("\nServer version: %s",mysql_get_server_info(&mysql));
    mysql_close(&mysql);
    return 1;
}
Now, following the advice of the previous post, I compile using the "-I" option to point to the mysql.h header file. (Yes, I am required to use GCC here):
root@1234:/home# gcc -I/usr/include/mysql sqlToy.c
/tmp/cc8c5JmT.o: In function `main':
sqlToy.c:(.text+0x25): undefined reference to `mysql_init'
sqlToy.c:(.text+0x69): undefined reference to `mysql_real_connect'
sqlToy.c:(.text+0x72): undefined reference to `mysql_get_client_info'
sqlToy.c:(.text+0x93): undefined reference to `mysql_get_server_info'
sqlToy.c:(.text+0xb4): undefined reference to `mysql_close'
collect2: error: ld returned 1 exit status
root@1234:/home#
Belly flop! The compiler has no idea what all the mySQL functions are, even though I've done all I can think of to point to that header file. As a sanity check, I made sure those mySQL functions are indeed in that header file; here's one as an example:
root@1234:/home# more /usr/include/mysql/mysql.h | grep mysql_init
libmysqlclient (exactly, mysql_server_init() is called by mysql_init() so
MYSQL * STDCALL mysql_init(MYSQL *mysql);
root@1234:/home#
So while I've pointed the compiler to the mysql.h header file in both my code AND with the "-I" option, it still has no clue what those 'mysql' functions are. The "-I" option was the solution in the previous post, but is not working for me here.
So... What might be the problem? I'm assuming this isn't a compiling problem, but maybe a linking one? In other words, I'm showing GCC where the mysql.h file is, but he is still not using it when processing the code?
Headers are not libraries. You are including the MySQL header during compilation, so these functions are declared, but you are not linking against the library which actually provides their definitions.
These functions are provided by the libmysqlclient library, so you need to add the -lmysqlclient flag to your command line to fix this. (Note that this is a lower-case l, not an I.)
Additionally, since you are adding /usr/include/mysql to your header search path, you can include the header as
#include <mysql.h>
You do not need to -- and should not! -- specify the full path to the header in the #include directive.
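Put together, the corrected build command might look like the sketch below. It reuses the sqlToy.c file name and the include directory from the question; the variable names are only for illustration, and the command is run only where gcc and the source file are actually present.

```shell
# Header search path (-I) matters at compile time; the library (-l) matters at link time.
include_flags="-I/usr/include/mysql"
link_flags="-lmysqlclient"

# The library is named after the source file that uses it.
cmd="gcc $include_flags sqlToy.c $link_flags"
echo "$cmd"

# Run the command only if the compiler and the source file exist here.
if command -v gcc >/dev/null 2>&1 && [ -f sqlToy.c ]; then
    $cmd
fi
```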

Caffe compilation fails: Undefined symbols for architecture x86_64?

If I try building the newest version of Caffe, it leads to this error:
$ make all
CXX/LD -o .build_release/tools/caffe.bin
clang: warning: argument unused during compilation: '-pthread'
Undefined symbols for architecture x86_64:
"caffe::Net<float>::Forward(float*)", referenced from:
test() in caffe.o
time() in caffe.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [.build_release/tools/caffe.bin] Error 1
I'm building on osx, OpenBLAS, and CPU_ONLY. I found a kind of similar issue on here but it appears to have been a resolved issue, and I'm not getting the exact same error, though perhaps it's related? I can also build and run an older version of Caffe from a month ago, so I think something has changed very recently.
Any ideas on how to overcome this error?
This is a link-time problem, which is common on OS X. I guess the cause is in Makefile.config. You can change it to
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
CUSTOM_CXX := g++
and confirm that the paths in it are correct.
I had exactly the same problem. It's now resolved.
Do check if you already have a libcaffe.so in your system library paths (maybe /usr/local/lib). If so, delete the existing libcaffe.so and build again.

CUDA 7.0 Error while compiling samples

I'm trying to install CUDA 7.0 on Ubuntu 14.04. I've followed the installation instructions as outlined here. Specifically, I've followed steps in section 3.6 and Chapter 6. While compiling the examples (Section 6.2.2.2) using make, I'm getting the following error:
make[1]: Entering directory `/usr/local/cuda-7.0/samples/3_Imaging/cudaDecodeGL'
/usr/local/cuda-7.0/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_20,code=compute_20 -o cudaDecodeGL FrameQueue.o ImageGL.o VideoDecoder.o VideoParser.o VideoSource.o cudaModuleMgr.o cudaProcessFrame.o videoDecodeGL.o -L../../common/lib/linux/x86_64 -L/usr/lib/"nvidia-346" -lGL -lGLU -lX11 -lXi -lXmu -lglut -lGLEW -lcuda -lcudart -lnvcuvid
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
make[1]: *** [cudaDecodeGL] Error 1
make[1]: Leaving directory `/usr/local/cuda-7.0/samples/3_Imaging/cudaDecodeGL'
make: *** [3_Imaging/cudaDecodeGL/Makefile.ph_build] Error 2
If you notice, there is -L/usr/lib/"nvidia-346". In my case, I have installed nvidia-349. What worked for me is to edit NVIDIA_CUDA-7.0_Samples/3_Imaging/cudaDecodeGL/findgllib.mk and change UBUNTU_PKG_NAME = "nvidia-346" to nvidia-349.
In order to properly install CUDA 7.0 on Ubuntu 14.04, you need a nvidia driver version 346 or higher.
If you're using the .deb installation method, the nvidia graphics driver is installed automatically.
If you used the .run file installation method and chose not to install the nvidia driver, you can manually install the driver afterwards through the package manager:
sudo apt-add-repository ppa:xorg-edgers/ppa && sudo apt-get update
sudo apt-get install nvidia-346 nvidia-346-dev nvidia-346-uvm libcuda1-346 nvidia-libopencl1-346 nvidia-icd-346
In my case, I installed nvidia-352 afterwards due to a bug in nvidia-346 and I stumbled upon the same error.
andoum's approach of manually changing the hard-coded UBUNTU_PKG_NAME = "nvidia-346" to UBUNTU_PKG_NAME = "nvidia-352" in NVIDIA_CUDA-7.0_Samples/3_Imaging/cudaDecodeGL/findgllib.mk worked fine for me.
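The manual edit can also be done with sed. The sketch below runs against a scratch file that holds only the relevant line, standing in for findgllib.mk; the variable names are made up, and nvidia-352 is just the version mentioned above.

```shell
# Driver package actually installed (352 in the answer above; adjust to yours).
installed_pkg="nvidia-352"

# Scratch file standing in for findgllib.mk, holding only the relevant line.
mk_file=$(mktemp)
echo 'UBUNTU_PKG_NAME = "nvidia-346"' > "$mk_file"

# Rewrite whatever nvidia-NNN value is hard-coded to the installed package name.
sed "s/^\(UBUNTU_PKG_NAME *= *\)\"nvidia-[0-9]*\"/\1\"$installed_pkg\"/" "$mk_file" > "$mk_file.new"

cat "$mk_file.new"
```

To apply it for real, point mk_file at the findgllib.mk of the sample you are building and copy the result back over it after checking the output.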
I hit the same issue, and the solution was to add the NVIDIA library directory to the library search path:
sudo gedit /etc/environment
and add this line to it:
LIBRARY_PATH=/usr/lib/your_nvidia_edition:$LIBRARY_PATH
I ran into this problem when running make, with CUDA 8.0 installed on Ubuntu 16.04. It confused me for several weeks, and after going through many suggestions found via Google I was close to reinstalling Ubuntu, but I finally solved it myself recently.
First of all, you should replace every UBUNTU_PKG_NAME = "nvidia-3xx" with the driver version you actually have installed, as recommended above. You will then probably get a different error when you run make again. In my case, I got link errors like
/usr/bin/ld: warning: libGLX.so.0, needed by /usr/lib/nvidia-375/libGL.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libGLdispatch.so.0, needed by /usr/lib/nvidia-375/libGL.so, not found (try using -rpath or -rpath-link)
....
or other errors about missing libraries. Locate the missing files, for example:
$ locate libGLX.so.
/usr/lib/nvidia-375/libGLX.so.0
/usr/lib32/nvidia-375/libGLX.so.0
$ locate libGLdispatch.so.0
/usr/lib/nvidia-375/libGLdispatch.so.0
/usr/lib32/nvidia-375/libGLdispatch.so.0
The error above probably occurs because the linker cannot find these files in the library directories you have configured, so you just need to copy the missing files to /usr/lib/nvidia-3xx/ (the actual path in your case) and this should work (it did in my case). If it doesn't, you could try symlinking the newly added files to the names being requested:
$ sudo ln -s (requested file) (requesting file).
Hope this will help.

Cuda 5.0 Linking Issue

I'm just trying to build an old project of mine using cuda 5.0 preview.
I get an Error when linking, telling me that certain cuda functions can not be found. For example:
undefined reference to 'cudaMalloc'.
My linking command includes the following options for cuda :
-L/usr/local/cuda/lib64 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux -lcudart
ls -lah /usr/local/cuda/lib64/ gives me 8 cuda libraries including libcudart.so.5.0.7 with symlinks using only the .so-file-ending.
ls /home/myhome/NVIDIA_CUDA_Samples/C/lib/ gives me an empty directory, which is kind of strange?
ls /home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux/ gives me two directories: i686 and x86_64 both containing only libGLEW.a
I have no idea which way to look for a solution. Any help is appreciated!
EDIT:
Here is my complete linking command (TARGET_APPLICATION is my binary and x86_64/Objectfiles.o stands for all (23) object files including the object file compiled with nvcc):
/home/myhome/nullmpi-0.7/bin/mpicxx -CC=g++ -I. -I/home/myhome/nullmpi-0.7/src -I/usr/lib/openmpi/include -L/usr/local/cuda/lib64 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux -lcudart -o TARGET_APPLICATION x86_64/Objectfiles.o /usr/lib/liblapack.so /usr/lib/libblas.so /home/myhome/nullmpi-0.7/lib/libnullpmpi.a -lm
I use nullmpi for compilation and linking (the project uses MPI and CUDA), which internally uses g++, as can be seen from -CC=g++; I had wanted to keep this detail out.
The compilation command for my cuda object file:
/usr/local/cuda/bin/nvcc -c -arch=sm_21 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -O3 kernelwrapper.cu -o x86_64/kernelwrapper.RELEASE.2.o
echo $LD_LIBRARY_PATH results in:
/usr/local/cuda/lib64:/usr/local/cuda/lib:
echo $PATH results in:
otherOptions:/usr/local/cuda/bin:/home/myhome/nullmpi-0.7/bin
I'm building 64-bit. For the sake of completeness I'm building on Ubuntu 12.04. (64bit).
Building the CUDA Samples works fine.
SOLUTION (thanks to talonmies for pointing me to it):
This is the correct linking command:
/home/myhome/nullmpi-0.7/bin/mpicxx -CC=g++ -I. -I/home/myhome/nullmpi-0.7/src -I/usr/lib/openmpi/include -L/usr/local/cuda/lib64 -L/home/myhome/NVIDIA_CUDA_Samples/C/lib -L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux -o TARGET_APPLICATION x86_64/Objectfiles.o /usr/lib/liblapack.so /usr/lib/libblas.so /home/myhome/nullmpi-0.7/lib/libnullpmpi.a -lcudart -lm
You have your linking statements in the incorrect order. It should be something more like this:
/home/myhome/nullmpi-0.7/bin/mpicxx -CC=g++ -I. -I/home/myhome/nullmpi-0.7/src \
-I/usr/lib/openmpi/include -L/usr/local/cuda/lib64 \
-L/home/myhome/NVIDIA_CUDA_Samples/C/lib \
-L/home/myhome/NVIDIA_CUDA_Samples/C/common/lib/linux \
-o TARGET_APPLICATION x86_64/Objectfiles.o \
/home/myhome/nullmpi-0.7/lib/libnullpmpi.a -llapack -lblas -lm -lcudart
The source of your problem is that you have specified the CUDA runtime library before the object file that contains a dependency to it. The linker simply discards libcudart.so from the linkage because there are no dependencies to it at the point when it is processed. Golden rule in POSIX style compilation statements: linkage statements are parsed left-to-right; so objects containing external dependencies first, libraries satisfying those dependencies afterwards.