I checked everything for my torch and CUDA setup. Both are installed correctly, and torch.cuda.is_available() returns True.
My torch version is 1.11.0 and my CUDA version is 11.3.0.
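For reference, the checks can be reproduced with a minimal snippet like this (a sketch of the verification, not code from the repo):
import torch
print(torch.__version__)          # prints 1.11.0 here
print(torch.version.cuda)         # prints 11.3 here
print(torch.cuda.is_available())  # prints True here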
I am using the code from this repository: https://github.com/wanyu-lin/ICML2021-Gem
I have also seen others post possible solutions:
- delete all .cuda() calls in the code, or
- change os.environ["CUDA_VISIBLE_DEVICES"] = "1" to os.environ["CUDA_VISIBLE_DEVICES"] = "0", or
- set device = "cpu"
But none of them works.
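For context, all three suggestions are variations of the usual device-agnostic pattern, which (as a sketch with a toy model, independent of the linked repo) looks like this:
import torch
import torch.nn as nn

# Pick GPU 0 if CUDA is usable, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)        # toy model standing in for the repo's model
data = torch.randn(8, 4, device=device)   # toy batch created on the same device
out = model(data)                          # replaces hard-coded .cuda() calls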
Can someone help me? I have been stuck on this problem for two days and have made no progress. Thank you for any suggestions!
Although Intel MKL speeds up calculations in GNU Octave, the results are sometimes totally wrong when the matrices are large (tested with Octave 5.2.0 on Xubuntu 20.04). This has been mentioned here and here.
For example, this gist shows a case where Octave and Scilab produce different results, and Octave's result is wrong (it changes every time the script is run; Octave gives the correct result with OpenBLAS).
Here is the code from the gist:
for a = 1:500
  for b = 1:500
    c(a,b) = sin(a + b^2);
  endfor
endfor
g = eig(c);
m = max(real(g))
% Correct result: m = 16.915
% With MKL on Ubuntu 20.04, I get random numbers of order 10^5 - 10^6, which change on every run
How to fix this issue?
This issue has been mentioned in some Debian bug reports (see this and this), and a bug report in Octave.
According to the Debian maintainers, this is neither a bug in Octave nor in MKL. It arises due to a race condition between libgomp and libiomp.
Here's how to fix it.
Enter the command
export MKL_THREADING_LAYER=gnu
in a terminal, and launch octave from the same terminal. The issue should no longer arise.
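Equivalently, you can set the variable for a single invocation only:
MKL_THREADING_LAYER=gnu octave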
To make this fix permanent, add the line export MKL_THREADING_LAYER=gnu to your .bashrc file.
Note: After installing MKL, I plotted a certain graph and found something was seriously wrong (although the calculations were faster).
I posted about it on the MKL community forum, and they said it's not their bug. Finally, I opened a bug report with Octave, and someone mentioned this workaround.
Warning: As mentioned in the bug report, Octave's test suite (__run_test_suite__) fails with a segmentation fault even when this workaround is applied. Therefore, Octave with MKL should be used with caution.
I am trying to compile C code with craycc. Compilation fails with the error "relocation truncated to fit: R_X86_64_32". With the Intel or GNU compilers I can get past this error with the -mcmodel flag, but craycc does not recognize that flag. Does anyone know of an equivalent flag or approach for craycc? I looked in the craycc man page, but couldn't find any discussion of this issue.
In case anyone is ever interested: the flag -h pic gets the Cray compiler past the problem and produces a running executable. I don't know how exact the equivalence is between this flag and -mcmodel=medium, but it was what was needed to solve this particular problem.
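For reference, assuming a hypothetical source file main.c, the invocation would look something like:
craycc -h pic -o main main.c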
I am trying to use the tensorboard callback in keras. When I run the pretrained inceptionv3 model with the tensorboard callback I am getting the following warning:
INFO:tensorflow:Summary name conv2d_95/kernel:0 is illegal; using conv2d_95/kernel_0 instead.
I saw a comment on GitHub addressing this issue. SeaFX pointed out in his comment that he solved it by replacing variable.name with variable.name.replace(':','_'). I am unsure how to do that. Can anyone please help me? Thanks in advance :)
I'm not sure how to get the name replacement to work; however, a workaround that may be sufficient for your needs is:
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.WARN)  # raise the log threshold before importing keras
import keras
This will turn off all INFO-level logging but keep warnings, errors, etc.
See this question for a discussion of the various log levels and how to change them. Personally, I found that setting the TF_CPP_MIN_LOG_LEVEL environment variable didn't work in a Jupyter notebook, but I haven't tested it in base Python.
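For completeness, the environment-variable approach is usually applied like this (a sketch; the variable must be set before TensorFlow is first imported):
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"  # "0" = all, "1" = filter INFO, "2" = filter INFO+WARNING
import tensorflow as tf
Note that this variable controls TensorFlow's C++ logging, which may be why it behaves differently from tf.logging.set_verbosity.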
Has anyone ever succeeded in installing MatConvNet under Octave?
If so, could you please let me know the steps to proceed?
Thanks and regards,
Arno
I was just looking into this issue myself. I have reached a point in researching it where I feel the issues are too complicated for my own project and not worth my time to finish running down. However, if someone else is determined to track this down, hopefully this information will help.
The basic problem comes down to Octave only compiling with 32-bit support, even if you use the 64-bit installer. If you want Octave to support 64-bit, you need to compile it from source with the appropriate build options. The other details are as follows.
MatConvNet appears to require a 64-bit system to compile:
http://www.vlfeat.org/matconvnet/mfiles/vl_compilenn/
MatConvNet detects the system architecture in the mex_cuda_config function in vl_compilenn.m:
https://github.com/vlfeat/matconvnet/blob/master/matlab/vl_compilenn.m
Octave's computer function is not a perfect analog of Matlab's, so either the mex_cuda_config function in vl_compilenn.m would need to be modified or Octave's computer function would need to be updated. More specifically, the handling of the 'arch' argument in the computer function needs to change.
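To illustrate the mismatch (an assumption based on the respective documentation, worth verifying on your own setup): MATLAB's computer('arch') returns short identifiers such as 'glnxa64' or 'win64', whereas Octave reports the host type in a different form, so detection logic keyed to MATLAB's strings will not match under Octave.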
There may be other issues, but this is where I would start if I had the time to invest in trying to track this down.
I have a Cython extension which I compiled on Ubuntu 14 and uploaded as an Anaconda package. I'm trying to install the package on another machine running Scientific Linux (6?), which ships with an older version of glibc. When I try to import the module, I get an error that looks (something like) this:
./myprogram: /lib/libc.so.6: version `GLIBC_2.14' not found (required by ./myprogram)
When I say "something like", I mean that "myprogram" is actually the .so name of the extension.
From what I understand, this error occurs because the build system has a newer version of glibc with an updated version of the memcpy function.
This page has a good description of the problem, and some rather impractical solutions: http://www.lightofdawn.org/wiki/wiki.cgi/NewAppsOnOldGlibc
There is also a much simpler answer proposed here: How can I link to a specific glibc version?
My question is: how do I apply this solution to my Cython extension? Assuming the __asm__ solution works (as given in the second link), what's the best way to get it into the C code generated by Cython?
Also, more generally, how do other modules avoid this issue in the first place? For example, I installed and ran a pre-built copy of numpy without any issues.
This turned out to be quite simple.
Create the following header, glibc_fix.h:
__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");
Then include it by using CFLAGS="-include glibc_fix.h". This can be set as an environment variable, or defined in setup.py.
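For illustration, here is a minimal setup.py sketch of the second route (mymodule and mymodule.pyx are hypothetical names; the essential part is passing -include glibc_fix.h via extra_compile_args):
from setuptools import setup
from setuptools.extension import Extension
from Cython.Build import cythonize

ext = Extension(
    "mymodule",                   # hypothetical extension name
    sources=["mymodule.pyx"],
    # Force glibc_fix.h into every translation unit so the .symver
    # directive pins memcpy to the old GLIBC_2.2.5 symbol version.
    extra_compile_args=["-include", "glibc_fix.h"],
)

setup(ext_modules=cythonize([ext]))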
Additionally, it turns out numpy doesn't do anything special in this regard: if I compile it myself, it links against the newer version on my system.