I am getting this warning during compilation of a C code with OpenMP directives on Linux:
warning: ignoring #pragma omp parallel
Gcc version is 4.4.
Is it only a warning I should not care about? Will the execution be in parallel?. I would like a solution with a some explanation.
I have provide -fopenmp with the make command, but gcc doesn't accept that, otherwise for single compilation of file, i.e. gcc -fopenmp works alright.
IIRC you have to pass -fopenmp to the g++ call to actually enable OpenMP. This will also link against the OpenMP runtime system.
Make sure that lib-gomp and lib-gomp-dev is installed. In some strange distributions it is removed. It is the essential runtime and development library.
This is probably a resolved/closed issue, because indeed the most common reason for this warning is the omission of the -fopenmp flag.
However, when I came across this problem the root cause for this was that the module openmp was not loaded, meaning:
load module openmp.
Related
CUDA kernels are launched with this syntax (at least in the runtime API)
mykernel<<<blocks, threads, shared_mem, stream>>>(args);
Is this implemented as a macro or is it special syntax that nvcc removes before handing host code off to gcc?
The nvcc preprocessing system eventually converts it to a sequence of CUDA runtime library calls before handing the code off to the host code compiler for compilation. The exact sequence of calls may change depending on CUDA version.
You can inspect files using the --keep option to nvcc (and --verbose may help with understanding as well), and you can also see a trace of API calls issued for a kernel call using one of the profilers e.g. nvprof --print-api-trace ...
---EDIT---
Just to make this answer more concise, nvcc directly modifies the host code to replace the <<<...>>> syntax before passing it off to the host compiler (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#offline-compilation)
I've linked all the required libraries and the caffee confige ran smoothly. But when I want to make the library I get this error:
/usr/bin/ld: /usr/local/lib/libgflags.a(gflags.cc.o): relocation R_X86_64_32S against `std::basic_string, std::allocator >::_Rep::_S_empty_rep_storage' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libgflags.a: could not read symbols: Bad value
I found a 'workaround' for this problem at the libgflags and glog troubleshooting websites:
https://code.google.com/p/google-glog/issues/detail?id=201
But I tried them and it doesn't seem to work. Am I missing something? Maybe I haven't uncommented a line in my original Makefile.config file? *I am installing caffe on my laptop with no CUDA or parallel computing for now.
Try recompiling the gflags library with -fPIC compiler flag.
Did the caffe work using gflags shared library instead of using the static one?
Try to choose 'BUILD SHARED LIBS' option when using Cmake
I am on Ubuntu 12.04 LTS and have installed CUDA 5.5. I understand that without any CUDA/GPGPU elements in the code, nvcc behaves as a C/C++ compiler -- more like gcc, however is there any exception to this rule ? if not, then can I use nvcc as gcc for non-CUDA C/C++ codes ?
No, nvcc doesn't behave like a C/C++ compiler for host code. What it does is the following:
separate device from host code into two separate files
compile device code (with nvcc, cudafe, ptxas)
invoke gcc for host code
If no device code exists, nothing is done in steps 1) and 2). So nvcc is actually no compiler, it is a compiler driver which invokes the right compilers for every part in the right order. To answer your question, if you use nvcc to compile host code only, you still use gcc.
It doesn't accept options to suppress warnings ( -W*)
I am running a cuda project. But somehow I am not able to set the flag -arch=sm_20 in the
sconscript file which has been written by someone else. I need to use printf in kernel for debugging and I have little experience of sconscript python.
The specifics depend on the way you have SCons set up to work with CUDA. I use these scripts: http://github.com/BryanCatanzaro/cuda-scons
With this setup, all you need to do is invoke SCons with your preferred architecture:
scons arch=sm_20
And nvcc will be invoked with the -arch=sm_20 flag.
Details of your setup may be different, but if you look through your SCons script, you should see how to change this flag.
with a very simple code, hello world, the breakpoint is not working.
I can't write the exact comment since it's not written in English,
but it's like 'the symbols of this document are not loaded' or something.
there's not cuda codes, just only one line printf in main function.
The working environment is windows7 64bit, vc++2008 sp1, cuda toolkit 3.1 64bits.
Please give me some explanation on this. :)
So this is just a host application (i.e. nothing to do with CUDA) doing printf that you can't debug? Have you selected "Debug" as the configuration instead of "Release"?
Are you trying to use a Visual Studio breakpoint to stop in your CUDA device code (.cu)? If that is the case, then I'm pretty sure that you can't do that. NVIDIA has released Parallel NSIGHT, which should allow you to do debugging of CUDA device code (.cu), though I don't have much experience with it myself.
Did you compile with -g -G options as noted in the documentation?
NVCC, the NVIDIA CUDA compiler driver, provides a mechanism for generating the debugging information necessary for CUDA-GDB to work properly. The -g -G option pair must be passed to NVCC when an application is compiled for ease of debugging with CUDA-GDB; for example,
nvcc -g -G foo.cu -o foo
here: https://docs.nvidia.com/cuda/cuda-gdb/index.html