Failed to build CUDA kernels for upfirdn2d on Google colab - deep-learning

I am trying to experiment with a deep learning framework called STIT (Stitch it in Time), which requires CUDA. Following advice I found on Stack Overflow, I set the Google Colab runtime type to GPU. However, I still get this warning:
warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
100% 98/98 [00:45<00:00, 2.13it/s]
This results in really slow training.
Any suggestions on what I should do here?

OK, I don't know why, but installing the following solved the problem:
!wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
!sudo unzip ninja-linux.zip -d /usr/local/bin/
!sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force
An explanation of why this works would be appreciated.
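A likely explanation, assuming STIT builds its custom ops the same way the StyleGAN2-ADA PyTorch code does: the upfirdn2d kernel is compiled at run time through torch.utils.cpp_extension, which relies on the ninja build tool. If ninja is not on PATH, the build fails and the code falls back to the slow reference implementation. The sketch below only checks whether ninja is reachable; have_tool is a hypothetical helper name, not part of STIT or PyTorch.

```shell
# Report whether a tool is reachable on PATH.
# have_tool is a hypothetical helper, not part of STIT or PyTorch.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool ninja; then
    echo "ninja found: $(command -v ninja) (version $(ninja --version))"
else
    echo "ninja not found on PATH; the CUDA extension build would likely fail"
fi
```

In a Colab notebook this would go in a %%bash cell (or be run line by line with a leading !).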

Related

CUDA 8 examples on Ubuntu 16 not finding libGLU, libGL, or libX11

I am using CUDA 8 and can run some of the examples, but I cannot get any of the visualizations to run. I have gotten them to work in the past, but now I cannot reproduce that on the same computer with a fresh install of either Mint or Ubuntu.
After a successful install of CUDA, I try to make the particles or nbody samples, but I get this error:
>>> WARNING - libGL.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
>>> WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
>>> WARNING - libX11.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
I looked through the Getting Started guide but did not find a solution.
I am systematically working through the symbolic links. Perhaps someone here can offer a suggestion.
Here is the result of a find:
$ sudo find / -name 'libGLU*'
/usr/lib/i386-linux-gnu/libGLU.so.1.3.1
/usr/lib/i386-linux-gnu/libGLU.so.1
/usr/lib/x86_64-linux-gnu/libGLU.a
/usr/lib/x86_64-linux-gnu/libGLU.so.1.3.1
/usr/lib/x86_64-linux-gnu/libGLU.so.1
/usr/lib/x86_64-linux-gnu/libGLU.so
I have been trying to create symbolic links to the i386 and x86_64 libraries but haven't gotten it to work yet.
For example, I am trying:
sudo ln -s /usr/lib/i386-linux-gnu/libGLU.so /usr/lib/libGLU.so
My question now is, which libGLU.so do I need to point "/usr/lib/libGLU.so" to?
.a ?
.1?
.1.3.1?
x86_64 or i386? I know my system is 64-bit, but is CUDA expecting a 32-bit library?
It doesn't seem like it should, but...?
I have tried the solutions on every SO and other board I can find... the two most relevant are
Cuda 6.5 cannot find - libGLU. (On ubuntu 14.04 64 bit)
and
http://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
which is where this question has existed previously.
It appears that the answer is in the link provided by Robert Crovella. First install the OpenGL and X11 development packages:
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libglfw3-dev libgles2-mesa-dev
Then build with
GLPATH=/usr/lib make
instead of plain make.
source of solution
Thank you Robert.
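As a rough way to approach the "which file?" question above, the sketch below prints what the unversioned libGLU.so name resolves to in each architecture directory. It assumes GNU readlink is available; resolve_lib is a hypothetical helper, and the two directories are simply the ones from the find output. On a 64-bit system the build would normally be expected to link against the x86_64 copy.

```shell
# Print the real file behind a library name, following every symlink.
# resolve_lib is a hypothetical helper, not part of CUDA.
resolve_lib() {
    readlink -f "$1"
}

# Directories taken from the find output in the question.
for dir in /usr/lib/x86_64-linux-gnu /usr/lib/i386-linux-gnu; do
    lib="$dir/libGLU.so"
    if [ -e "$lib" ]; then
        echo "$lib -> $(resolve_lib "$lib")"
    else
        echo "$lib is missing (the unversioned name usually comes from the -dev package)"
    fi
done
```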

No response when trying to install instrument-control package in octave

I am running Raspbian on a Raspberry Pi 2 B. I am trying to install the instrument-control package in Octave 3.6.2. Initially I got a "mkoctfile missing" error, as described in "Installing octave package in ubuntu" (http://ubuntuforums.org/showthread.php?t=955385).
Running 'sudo apt-get install octave-pkg-dev' solved that problem.
However, when I try to install the instrument-control package in Octave with 'pkg install -forge instrument-control', nothing seems to happen. The prompt cursor just flickers as if something were being processed, but there is no output for at least half an hour (I didn't wait longer). I tried running Octave as superuser, with the same result.
I would very much appreciate if someone could help me out on this.
NB: This is my first post here and I am not very experienced, so please tell me if I need to provide more information in any form.
It turned out that compiling all the necessary files actually takes longer than 30 minutes, so Octave is doing something. You just have to wait longer. Running 'pkg install -verbose -forge instrument-control' makes this clearer.
The download might be hanging.
As a workaround, you can download the instrument-control package archive manually.
Put it in a directory on the Raspberry Pi.
Then, in Octave launched from that same directory, run
pkg install instrument-control-0.2.1.tar.gz

no mpicxx when compiling examples for NVIDIA CUDA 5

I installed the driver and toolkit for CUDA 5 in 64-bit RHEL 6.3 successfully.
However, when I tried compiling the CUDA 5 examples, I got the error message:
make[1]: Leaving directory `/root/NVIDIA_CUDA-5.0_Samples/0_Simple/cppIntegration'
which: no mpicxx
How can I fix this for the CUDA 5 examples to compile?
In order to build the simpleMPI example, you need some kind of MPI installed on your system. You can get around this and build most of the samples by doing:
make -k
This tells make to keep going past errors and build every target that can still be built.
If you prefer, you can delete this directory:
/root/NVIDIA_CUDA-5.0_Samples/0_Simple/simpleMPI
perhaps with the following command, as root:
rm -Rf /root/NVIDIA_CUDA-5.0_Samples/0_Simple/simpleMPI
and relaunch your make. Personally I think the make -k option is simpler.
(The message about cppIntegration just names the last target that was built successfully.)
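To see what make -k does in isolation, here is a small throwaway demo, assuming GNU make is installed. The Makefile and its target names (broken, fine) are made up for illustration and have nothing to do with the CUDA samples.

```shell
# Build a throwaway Makefile with one failing and one working target.
demo_dir=$(mktemp -d)
printf 'all: broken fine\nbroken:\n\tfalse\nfine:\n\ttouch fine.out\n' > "$demo_dir/Makefile"

# -k keeps going after "broken" fails, so "fine" is still built.
# make still exits non-zero because one target failed.
make -k -C "$demo_dir" || echo "make reported errors, but kept going"

# fine.out should be listed even though the overall build failed.
ls "$demo_dir"
```

Without -k, a serial make would stop at the first failing target and never reach the second one, which is what happens when simpleMPI cannot find mpicxx.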

cuda-gdb exits with "[1] stopped" when it hits a kernel call

I'm pretty new to CUDA and flying a bit by the seat of my pants here...
I'm trying to debug my CUDA program on a remote machine I don't have admin rights on. I compile my program with nvcc -g -G and then try to debug it with cuda-gdb. However, as soon as gdb hits a call to a kernel (doesn't even have to enter it, and it doesn't happen in host code), I get:
(cuda-gdb) run
Starting program: /path/to/my/binary/cuda_clustered_tree
[Thread debugging using libthread_db enabled]
[1]+ Stopped cuda-gdb cuda_clustered_tree
cuda-gdb then dumps me back to my terminal. If I try to run cuda-gdb again, I get
An instance of cuda-gdb (pid 4065) is already using device 0. If you believe
you are seeing this message in error, try deleting /tmp/cuda-dbg/cuda-gdb.lock.
The only way to recover is to kill -9 cuda-gdb and cuda_clustered_ (I assume the latter is part of my binary).
This machine has two GPUs, is running CUDA 4.1 (I believe; several versions were installed, but that's the one I pointed PATH and LD_LIBRARY_PATH at), and compiles and runs deviceQuery and bandwidthTest fine.
I can provide more info if need be. I've searched everywhere I could find online and found no help with this.
Figured it out! Turns out, cuda-gdb hates csh.
If you are running csh, it will cause cuda-gdb to exhibit the above anomalous behavior. Even running bash from within csh, then running cuda-gdb, I still saw the behavior. You need to start your shell as bash, and only bash.
On the machine, the default shell was csh, but I use bash. I wasn't allowed to change it directly, so I added 'exec /bin/bash --login' to my .login script.
So even though I was running bash, cuda-gdb still misbehaved because bash had been started by csh. Removing the 'exec' command, so that I was running plain csh with nothing on top, also showed the behavior.
In the end, I had to get IT to change my shell to bash directly (after much patient troubleshooting by them.) Now it works as intended.
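A quick way to tell whether bash really is the login shell, rather than something launched from csh, is to compare the account's registered shell with the SHELL variable. This is only a sketch, assuming a Linux system where getent is available; login_shell_of is a hypothetical helper.

```shell
# Field 7 of the passwd entry is the account's login shell.
# login_shell_of is a hypothetical helper name.
login_shell_of() {
    getent passwd "$1" | cut -d: -f7
}

echo "registered login shell: $(login_shell_of "$(id -un)")"
echo "SHELL variable:         $SHELL"
```

If the registered login shell still reports csh or tcsh, bash is only running on top of it, which is the situation described above.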

Error in Fedora Linux -- clock skew detected

How to solve the "Clock Skew detected. Your build may be incomplete" error in Fedora Linux?
I am getting this error while using the make command in the terminal.
Yes, definitely, cleaning up your codebase and then issuing
find . -exec touch {} \;
does the trick.
As the link says:
Sometimes the last-modified time on the files is wrong because it is later than the time-of-day clock. make then issues the above message. You usually see this sort of problem in programming environments that use NFS to share files but don't sync clocks using NTP.
Simple solution:
touch filename
resets that file's timestamp to the current time, which clears the warning.
For more info:
http://embeddedbuzz.blogspot.in/2012/03/make-warning-clock-skew-detected-your.html
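Before touching everything, it can help to see which files actually carry a timestamp ahead of the local clock. The sketch below does that by comparing against a freshly created reference file. It assumes mktemp and find are available; list_future_files is a hypothetical helper.

```shell
# Print files under a directory whose modification time is later than
# the current local clock. list_future_files is a hypothetical helper.
list_future_files() {
    ref=$(mktemp)                      # freshly created, so its mtime is "now"
    find "$1" -type f -newer "$ref"    # anything newer than "now" is in the future
    rm -f "$ref"
}

list_future_files .
```

Any file it prints can then be fixed with touch, as in the answer above.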
Have you seen the question "Clock skew detected. Your build may be incompleted."?