No CUDA-compatible device detected in Nsight Eclipse. Why?

I'm writing a simple fast Fourier transform program with the cuFFT CUDA library. My source file works well with Visual Studio on Windows 7, but with Nsight Eclipse on Ubuntu 14.04 it does not work!
I've installed the NVIDIA 346.72 driver and CUDA Toolkit 7.0, and my video hardware is a GeForce 410M. When I build my source code, I get the following message:
16:56:24 **** Incremental Build of configuration Debug for project cufft_double ****
make all
Building target: cufft_double
Invoking: NVCC Linker
/usr/local/cuda-7.0/bin/nvcc --cudart static -L/usr/local/cuda-7.0/lib64 --relocatable-device-code=false -gencode arch=compute_20,code=compute_20 -gencode arch=compute_20,code=sm_20 -m64 -link -o "cufft_double" ./cufft_double.o
./cufft_double.o: In function `main':
/home/marco/cuda-workspace/cufft_double/Debug/../cufft_double.cu:79: undefined reference to `cufftPlan1d'
/home/marco/cuda-workspace/cufft_double/Debug/../cufft_double.cu:85: undefined reference to `cufftExecZ2Z'
/home/marco/cuda-workspace/cufft_double/Debug/../cufft_double.cu:108: undefined reference to `cufftDestroy'
/home/marco/cuda-workspace/cufft_double/Debug/../cufft_double.cu:111: undefined reference to `cufftPlan1d'
/home/marco/cuda-workspace/cufft_double/Debug/../cufft_double.cu:117: undefined reference to `cufftExecZ2Z'
/home/marco/cuda-workspace/cufft_double/Debug/../cufft_double.cu:136: undefined reference to `cufftDestroy'
collect2: error: ld returned 1 exit status
make: *** [cufft_double] Error 1
16:56:27 Build Finished (took 2s.792ms)
I tried to set the library path, but in the Preferences window I read "No CUDA-compatible devices detected".
Please help me!
Best regards,
Marco
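The undefined references to cufftPlan1d, cufftExecZ2Z and cufftDestroy are what the linker typically reports when the cuFFT library is missing from the link step, and the NVCC Linker command in the log above contains no -lcufft. Below is a minimal sketch of a check for this symptom in a saved build log. The function name needs_cufft and the file name build.log are hypothetical. In Nsight the library would normally be added in the project's NVCC Linker library settings, but the exact menu location is an assumption here.

```shell
# Hypothetical helper: succeed if a build log ($1) contains unresolved cuFFT
# symbols, which usually means -lcufft is missing from the link command.
needs_cufft() {
    grep -q "undefined reference to \`cufft" "$1"
}

# Example use (build.log is an assumed file name):
#   needs_cufft build.log && echo "add -lcufft to the link step"
#
# Manual link command with the library added (paths taken from the log above):
#   /usr/local/cuda-7.0/bin/nvcc -L/usr/local/cuda-7.0/lib64 \
#       -o cufft_double cufft_double.o -lcufft
```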
Now I can build the source code, but my program does not work!
I get this error:
modprobe: ERROR: could not insert 'nvidia_331_uvm': Invalid argument
and I get the message that my program prints when cudaGetLastError() != cudaSuccess after cudaMalloc.
To be precise, I read "Cuda error: allocazione fallita" (Italian for "allocation failed") from this fragment of code:
cudaMalloc((void**)&out_device, sizeof(cufftDoubleComplex)*NX*BATCH);
if (cudaGetLastError() != cudaSuccess) {
    printf("Cuda error: allocazione fallita\n");
    return 0;
}

Run these commands in sequence:
sudo apt-get remove --purge 'nvidia-*'
sudo apt-get install cuda-drivers
sudo apt-get install nvidia-nsight
Restart the machine, open Nsight, and check in the properties whether the driver is now detected.
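The modprobe error in the question names nvidia_331_uvm, while the question states that driver 346.72 is installed. That suggests leftovers from an older driver series, which is presumably why purging every nvidia-* package first is advised. Here is a small sketch for spotting such a mismatch. The function names are hypothetical, and the version strings are the ones quoted in the question.

```shell
# Extract the driver series from a kernel module name such as nvidia_331_uvm.
module_series() {
    printf '%s\n' "$1" | sed -n 's/^nvidia_\([0-9][0-9]*\).*$/\1/p'
}

# Extract the series (major version) from a driver version such as 346.72.
driver_series() {
    printf '%s\n' "${1%%.*}"
}

# Succeed when the module and the driver belong to the same series.
same_series() {
    [ "$(module_series "$1")" = "$(driver_series "$2")" ]
}

if same_series nvidia_331_uvm 346.72; then
    echo "module matches driver"
else
    echo "stale module: series $(module_series nvidia_331_uvm), driver series $(driver_series 346.72)"
fi
```

On a live system, the loaded module names can be listed with 'lsmod | grep nvidia' and the installed packages with 'dpkg -l | grep nvidia'.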

Related

Segmentation fault when compiling Darknet for GPU

I want to compile the Darknet machine learning framework on my PC with GPU support. However, when I call make I get a segmentation fault:
nvcc -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=[sm_50,compute_50] -gencode arch=compute_52,code=[sm_52,compute_52] -Iinclude/ -Isrc/ -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ --compiler-options "-Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DOPENCV -DGPU" -c ./src/convolutional_kernels.cu -o obj/convolutional_kernels.o
Segmentation fault (core dumped)
Makefile:92: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 139
nvidia-smi gives me following information:
NVIDIA-SMI 418.87.01 Driver Version: 418.87.01 CUDA Version: 10.1
When I do nvcc --version I get:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
The CUDA version 10.1 reported by nvidia-smi is not the same as version 9.1 of the CUDA compilation tools. Could this be the problem? nvcc was installed via apt install nvidia-cuda-toolkit.
Just going to post my solution here because I figured out the actual reason for this. It happens because a different nvcc binary is being run than the one Darknet needs. For me, 'which nvcc' gave /usr/bin/nvcc. The nvcc you actually want is located in /usr/local/cuda-11.1/bin (your version number may differ). So all you need to do is prepend (important!) that directory to your PATH variable.
echo 'export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}' >> ~/.bashrc
Source: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions
I recommend you follow the link because there are a couple more mandatory post-installation steps that I also did not follow.
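Prepending matters because the shell runs the first nvcc it finds while scanning PATH from left to right, so /usr/bin/nvcc wins unless the CUDA directory comes earlier. Below is a minimal sketch of a helper that puts a directory first without adding it twice. The function name prepend_dir is hypothetical, and the directory is the one from the answer above.

```shell
# Print PATH-style list $2 with directory $1 in front.
# If $1 is already the first entry, the list is printed unchanged.
prepend_dir() {
    case "$2" in
        "$1"|"$1":*) printf '%s\n' "$2" ;;    # already first
        "")          printf '%s\n' "$1" ;;    # empty list
        *)           printf '%s\n' "$1:$2" ;;
    esac
}

# Put the toolkit's bin directory first, then show which nvcc would be run.
PATH=$(prepend_dir /usr/local/cuda-11.1/bin "$PATH")
export PATH
command -v nvcc || echo "nvcc not found on PATH"
```

Running the last three lines more than once leaves PATH unchanged after the first run, which makes them safe to keep in a shell startup file.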
I solved the problem. After installing cuda the actual binary of nvcc is at /usr/local/cuda/bin/nvcc. Creating a symbolic link in /usr/bin/ to this binary solved the problem.
Another approach is to edit the Makefile and set the correct nvcc.
In my case, on line 24, replace
NVCC=nvcc
with
NVCC=/usr/local/cuda-11.0/bin/nvcc
Note that the cuda version may vary.
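Since the line number can differ between Darknet versions, the NVCC assignment can also be rewritten by pattern instead of by position. This is a sketch that assumes the Makefile contains a line starting with NVCC=. The function name set_nvcc is hypothetical, and the nvcc path is the one from the answer above. With GNU make, running 'make NVCC=/usr/local/cuda-11.0/bin/nvcc' should have the same effect without touching the file, because variables given on the command line override plain assignments in the Makefile.

```shell
# Rewrite the NVCC= line of a Makefile ($1) to point at a specific nvcc binary ($2).
set_nvcc() {
    tmp=$(mktemp) || return 1
    # Use | as the sed delimiter because the replacement is a path containing /.
    sed "s|^NVCC=.*|NVCC=$2|" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use, run from the Darknet source directory:
#   set_nvcc Makefile /usr/local/cuda-11.0/bin/nvcc
```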

Error again when installing rpy2 (compiled with Homebrew's LLVM)

When I ran 'pip install rpy2', it reported an error:
clang: error: unsupported option '-fopenmp'
I installed LLVM with brew since its clang supports the option, but I had to link brew's LLVM (clang-5.0) to the clang command. Then I ran 'pip install rpy2' again and got a new error. How do I fix it?
ld: library not found for -lomp
clang-5.0: error: linker command failed with exit code 1 (use -v to see invocation)
error: command '/usr/local/bin/clang' failed with exit status 1
There is an open issue on the rpy2 tracker with a similar error message. It has an active discussion in which some people report success compiling rpy2 after the developer tools provided by Apple and the requirements for compiling the latest R release became incompatible:
https://bitbucket.org/rpy2/rpy2/issues/403/cannot-pip-install-rpy2-with-latest-r-340

Error compiling caffe on Mac OS

I compiled Caffe successfully at first and tried the MNIST example. But something went wrong when I tried to install pycaffe, so I reinstalled Caffe. This time, however, I got a compile error:
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [.build_release/lib/libcaffe.so.1.0.0] Error 1
Running 'clang -v' gives:
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
I figured out that the problem was a conflict between two versions of Python: the system default and the Anaconda one. You have to make sure that compiling and running both use the same one.

Compile error during Caffe installation on OS X 10.11

I've configured the Caffe environment on my Mac several times, but this time I encountered a problem I've never met before.
I use Intel's MKL instead of ATLAS to accelerate computation, and I use Anaconda 2.7 and OpenCV 2.4, with Xcode 7.3.1 on OS X 10.11.6.
When I run
make all -j8
in a terminal under Caffe's root directory, the error output is:
AR -o .build_release/lib/libcaffe.a
LD -o .build_release/lib/libcaffe.so.1.0.0-rc5
clang: warning: argument unused during compilation: '-pthread'
ld: can't map file, errno=22 file '/usr/local/cuda/lib' for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [.build_release/lib/libcaffe.so.1.0.0-rc5] Error 1
make: *** Waiting for unfinished jobs....
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: file: .build_release/lib/libcaffe.a(parallel.o) has no symbols
I've tried many times. Can anyone help me out?
This looks like you haven't changed Makefile.config from GPU to CPU mode. In CPU-only mode, nothing should be actively trying to link against /usr/local/cuda/lib. (If libicudata.so shows up on the link line, that one is fine: despite the name, it belongs to ICU, not CUDA.)
Look for the lines
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
and remove the leading # (octothorpe) from the second line, so that it reads CPU_ONLY := 1.
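The same edit can be scripted. This is a sketch that assumes Makefile.config still contains the commented-out line exactly as quoted above. The function name enable_cpu_only is hypothetical.

```shell
# Uncomment the "# CPU_ONLY := 1" line in a Caffe Makefile.config ($1).
# Other comment lines, including the "CPU-only switch" description, are left alone.
enable_cpu_only() {
    tmp=$(mktemp) || return 1
    sed 's/^#[[:space:]]*\(CPU_ONLY[[:space:]]*:=[[:space:]]*1\)/\1/' "$1" > "$tmp" \
        && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use, run from Caffe's root directory:
#   enable_cpu_only Makefile.config
```

A 'make clean' before rebuilding is probably needed so that objects compiled in GPU mode are not reused.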

CUDA 7.0 Error while compiling samples

I'm trying to install CUDA 7.0 on Ubuntu 14.04. I've followed the installation instructions as outlined here. Specifically, I've followed steps in section 3.6 and Chapter 6. While compiling the examples (Section 6.2.2.2) using make, I'm getting the following error:
make[1]: Entering directory `/usr/local/cuda-7.0/samples/3_Imaging/cudaDecodeGL'
/usr/local/cuda-7.0/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_20,code=compute_20 -o cudaDecodeGL FrameQueue.o ImageGL.o VideoDecoder.o VideoParser.o VideoSource.o cudaModuleMgr.o cudaProcessFrame.o videoDecodeGL.o -L../../common/lib/linux/x86_64 -L/usr/lib/"nvidia-346" -lGL -lGLU -lX11 -lXi -lXmu -lglut -lGLEW -lcuda -lcudart -lnvcuvid
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
make[1]: *** [cudaDecodeGL] Error 1
make[1]: Leaving directory `/usr/local/cuda-7.0/samples/3_Imaging/cudaDecodeGL'
make: *** [3_Imaging/cudaDecodeGL/Makefile.ph_build] Error 2
Notice the -L/usr/lib/"nvidia-346" in the link command. In my case, I had installed nvidia-349. What worked for me was to edit NVIDIA_CUDA-7.0_Samples/3_Imaging/cudaDecodeGL/findgllib.mk and change UBUNTU_PKG_NAME = "nvidia-346" to UBUNTU_PKG_NAME = "nvidia-349".
To install CUDA 7.0 properly on Ubuntu 14.04, you need an NVIDIA driver of version 346 or higher.
If you use the .deb installation method, the NVIDIA graphics driver is installed automatically.
If you used the .run file installation method and chose not to install the NVIDIA driver, you can install the driver manually afterwards through the package manager:
sudo apt-add-repository ppa:xorg-edgers/ppa && sudo apt-get update
sudo apt-get install nvidia-346 nvidia-346-dev nvidia-346-uvm libcuda1-346 nvidia-libopencl1-346 nvidia-icd-346
In my case, I installed nvidia-352 afterwards due to a bug in nvidia-346 and I stumbled upon the same error.
andoum's approach of manually changing the hard-coded UBUNTU_PKG_NAME = "nvidia-346" to UBUNTU_PKG_NAME = "nvidia-352" in NVIDIA_CUDA-7.0_Samples/3_Imaging/cudaDecodeGL/findgllib.mk worked fine for me.
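The two answers above make the same edit by hand. This is a sketch of how it could be scripted, assuming the driver libraries live in a directory named /usr/lib/nvidia-NNN (as the -L/usr/lib/"nvidia-346" flag in the log suggests) and that findgllib.mk contains the assignment exactly as quoted. Both function names are hypothetical.

```shell
# Print the newest driver directory name (e.g. nvidia-352) found under a
# library root ($1, default /usr/lib). Prints nothing if none is found.
installed_pkg_name() {
    root=${1:-/usr/lib}
    ls -d "$root"/nvidia-[0-9]* 2>/dev/null | sed 's|.*/||' | sort -t- -k2 -n | tail -n 1
}

# Point UBUNTU_PKG_NAME in a findgllib.mk file ($1) at a driver package name ($2).
set_pkg_name() {
    tmp=$(mktemp) || return 1
    sed "s/\(UBUNTU_PKG_NAME[[:space:]]*=[[:space:]]*\)\"nvidia-[0-9]*\"/\1\"$2\"/" "$1" > "$tmp" \
        && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use for the sample that failed above:
#   set_pkg_name NVIDIA_CUDA-7.0_Samples/3_Imaging/cudaDecodeGL/findgllib.mk \
#       "$(installed_pkg_name)"
```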
I met the same issue, and the solution was to add the NVIDIA library directory to the system environment:
sudo gedit /etc/environment
Add this line to the file:
LIBRARY_PATH=/usr/lib/your_nvidia_edition:$LIBRARY_PATH
I ran into this problem when running make, with CUDA 8.0 installed on Ubuntu 16.04. It confused me for several weeks, and after going through many suggestions found via Google I was close to reinstalling Ubuntu, but I finally solved it myself.
First of all, replace every UBUNTU_PKG_NAME = "nvidia-3xx" with the NVIDIA driver version you actually have installed, as recommended above. You will then probably get further errors when you run make again. In my case, I got link errors like
/usr/bin/ld: warning: libGLX.so.0, needed by /usr/lib/nvidia-375/libGL.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libGLdispatch.so.0, needed by /usr/lib/nvidia-375/libGL.so, not found (try using -rpath or -rpath-link)
....
or other errors about missing libraries. Locate the files that are missing, for example:
$ locate libGLX.so.
/usr/lib/nvidia-375/libGLX.so.0
/usr/lib32/nvidia-375/libGLX.so.0
$ locate libGLdispatch.so.0
/usr/lib/nvidia-375/libGLdispatch.so.0
/usr/lib32/nvidia-375/libGLdispatch.so.0
The errors above are probably caused by the linker not finding these files in the CUDA library directories you configured, so you just need to copy the missing files to /usr/lib/nvidia-3xx/ (the actual path in your case). This worked in my case. If it doesn't work for you, you could try symlinking the located files to the place where they are expected:
$ sudo ln -s (located file) (path where it is expected)
Hope this will help.
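When there are many such warnings, the library names can be pulled out of the linker output before searching for them. This is a minimal sketch, assuming the warnings have the single-line "warning: NAME, needed by ..., not found" form that ld prints. The function name missing_libs and the file name make.log are hypothetical.

```shell
# Read linker output on stdin and print each library reported as
# "needed by ..., not found" once.
missing_libs() {
    sed -n 's/^.*warning: \([^,]*\), needed by .*not found.*$/\1/p' | LC_ALL=C sort -u
}

# Example use (make.log is an assumed file name):
#   make 2>&1 | tee make.log
#   missing_libs < make.log | while read -r lib; do locate "$lib"; done
```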