cannot open /lib/ld-linux-aarch64.so.1 in qemu or gem5

I am trying to simulate a simple Hello World ARM example on my desktop computer. I tried both qemu and gem5, and both give a similar error: they cannot find ld-linux-aarch64.so.1. Actually, I cannot find it either. If I could find it, I would point to it with -L (in qemu) or --redirects (in gem5).
The file is:
armhello: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=23a21b7a545ac510923b6b3713d2bbee092f820a, for GNU/Linux 3.7.0, not stripped
It is compiled with: aarch64-linux-gnu-gcc
I am trying to run it in qemu with:
qemu-aarch64 armhello
I got this error:
/lib/ld-linux-aarch64.so.1: No such file or directory
I tried to run it in gem5 with the following (simpleARM.py points to my executable, named armhello):
build/ARM/gem5.opt configs/tutorial/simpleARM.py
I got this error:
panic: panic condition fd < 0 occurred: Failed to open file /lib/ld-linux-aarch64.so.1.
How can I solve this?
Note: I know it works when compiled with --static. But I need to run more complex binaries that are dynamically linked, and I cannot change those. This is just an example.

For gem5 you can use --redirects and --interp-dir; see "How to run a dynamically linked executable syscall emulation mode se.py in gem5?"
For qemu you need -L; see "Using dynamic linker with qemu-arm".
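Both options need a directory that actually contains lib/ld-linux-aarch64.so.1. A minimal sketch for locating one is shown below. The candidate paths are assumptions: /usr/aarch64-linux-gnu is where Debian and Ubuntu cross-toolchain packages usually place the aarch64 C library, and /opt/sysroot is purely hypothetical. The helper name find_aarch64_sysroot is made up for this example.

```shell
# Print the first candidate directory that contains the aarch64 loader.
# The candidates are passed as arguments; none of them is guaranteed to exist.
find_aarch64_sysroot() {
    for dir in "$@"; do
        if [ -e "$dir/lib/ld-linux-aarch64.so.1" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage (not run here; the paths are assumptions):
#   sysroot=$(find_aarch64_sysroot /usr/aarch64-linux-gnu /opt/sysroot) &&
#       qemu-aarch64 -L "$sysroot" ./armhello
```

With -L, qemu-aarch64 treats that directory as the prefix for the ELF interpreter path, so /lib/ld-linux-aarch64.so.1 is looked up underneath it. The same directory is what you would hand to gem5's --interp-dir and --redirects.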

I had the same problem on an x86_64 machine when running docker build with an arm64 docker image:
FROM multiarch/qemu-user-static:x86_64-aarch64 as qemu
FROM alpine
COPY --from=qemu /usr/bin/qemu-aarch64-static /usr/bin/
# add this line to resolve
RUN apk add libc6-compat

Related

How to specify include directory of mpicxx in the command line option of make?

I am trying to build all CUDA samples by running make in the sample's base folder. One of the samples require mpi.h, but the system did not have it, which causes an error:
make[1]: Entering directory '$HOME/cuda_samples/samples/0_Simple/simpleMPI'
/bin/mpicxx -I../../common/inc -o simpleMPI_mpi.o -c simpleMPI.cpp
simpleMPI.cpp:25:10: fatal error: mpi.h: No such file or directory
   25 | #include <mpi.h>
      |          ^~~~~~~
compilation terminated.
make[1]: *** [Makefile:371: simpleMPI_mpi.o] Error 1
Since I don't have root privileges, I downloaded the deb file for the libopenmpi-dev package (using apt-get download) and extracted it into my user space (using dpkg -x). However, as the output shows, mpicxx is only told to look for headers in ../../common/inc, which is not where I extracted libopenmpi-dev (I did not notice that until I had installed the package, my bad). So I need to somehow tell mpicxx to find mpi.h in another directory.
I know make has a -I option, but it sets the search path for included makefiles and is not passed on to mpicxx. How to pass a directory from make's command line to mpicxx is beyond my knowledge. What option should I use on make's command line to specify the include directory used by mpicxx? Of course I could manually copy the extracted libopenmpi-dev headers into ../../common/inc to match the original settings of the CUDA sample, but I would like to learn something new, so I am asking here. Thank you in advance.
Environment:
Remote Linux with kernel version 5.8.0. I am not a superuser.
CUDA version: 11.2
CPU: Intel Core i9-10900K
gcc version: (Ubuntu 10.2.0-13ubuntu1) 10.2.0
make version: GNU Make 4.3, Built for x86_64-pc-linux-gnu
MPI version: 4.0.3
In the Makefile, the include directories are held in a variable named INCLUDES, each prefixed with -I. So if the MPI include directory in my user space can be passed into this variable, we are done. The question therefore reduces to how to set a Makefile variable from make's command line, overriding any value the Makefile already assigns.
Fortunately, make accepts VAR=value arguments on the command line, so the answer to my question is
make INCLUDES=-I/path/to/mpi/include/in/my/user/space
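One caveat: a variable given on make's command line replaces the Makefile's own assignment, so the original -I../../common/inc is dropped unless it is repeated. The sketch below finds the directory holding mpi.h under an extraction prefix and prints it as an -I flag. The helper name mpi_include_flag and the prefix $HOME/opt/openmpi are hypothetical, chosen only for illustration.

```shell
# Print an -I flag for the first directory under PREFIX ($1) that contains mpi.h.
mpi_include_flag() {
    header=$(find "$1" -name mpi.h -print 2>/dev/null | head -n 1)
    [ -n "$header" ] || return 1
    printf '%s\n' "-I$(dirname "$header")"
}

# Usage (not run here; the prefix is an assumption):
#   make INCLUDES="-I../../common/inc $(mpi_include_flag "$HOME/opt/openmpi")"
```

Whether the sample still needs ../../common/inc depends on the sample's sources. Keeping it in the override is the safer default.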

Unable to get cuda to work in tensorflow

I'm trying to use cuda to accelerate tensorflow. I'm running tensorflow using the docker image.
First, when I launch the gpu image, the LD_LIBRARY_PATH environment variable does not match what is in the image:
~# echo $LD_LIBRARY_PATH
/usr/local/nvidia/lib:/usr/local/nvidia/lib64:
root@d578acbbc2cd:~# ls /usr/local/
bin cuda cuda-7.0 etc games include lib man sbin share src
There's no nvidia directory there. When I try to run the convolutional.py demo, it can't initialise the cuda support:
# python models/image/mnist/convolutional.py
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 8
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/4.2.0-23-generic/modules.dep.bin'
E tensorflow/stream_executor/cuda/cuda_driver.cc:466] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:98] retrieving CUDA diagnostic information for host: d578acbbc2cd
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:106] hostname: d578acbbc2cd
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:131] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:242] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.68 Tue Dec 1 17:24:11 PST 2015
GCC version: gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2)
"""
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:135] kernel reported version is: 352.68
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA:
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 8
It then goes on to train using cpu only.
# find /usr -name libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so
So in the docker image, there's only the gnu cpu cuda implementation. No NVIDIA stuff. In the host ubuntu 15.10 session, I have libcuda.so installed:
$ find /usr -name libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/i386-linux-gnu/libcuda.so
/usr/local/cuda-7.5/targets/x86_64-linux/lib/stubs/libcuda.so
So these seem to be stubs ... not sure why.
Is there some trick to getting this to work?
Try rebuilding the Docker image directly from the Tensorflow repository (i.e. don't rely on the image on the container registry) and use https://github.com/NVIDIA/nvidia-docker to run the container (the Docker command described in the Tensorflow documentation is not portable).
I had a similar problem, though not in docker. The libcuda.so in /usr/local/cuda/lib64/stubs was a broken sym link. When I searched for libcuda.so it only turned up a file in a lib32 folder.
It seems that the problem was how I originally installed the NVIDIA device driver. At some point in the driver install process you're given the option to install the lib32 drivers. I had thought this meant in addition to lib64 drivers so I selected it. Turns out it only installs lib32 and not lib64 drivers.
I reinstalled the NVIDIA device driver, this time not selecting the lib32 'option'. Now tensorflow finds libcuda.so.
I had the same problem running tensorflow on an Ubuntu machine after I upgraded my driver to 352.63 and 352.93. (I remember it working with 346.*, but when I try to install 346.*, it installs 352.* automatically for some reason.)
I finally figured out that it was caused by a permission issue (it ran fine as root). So I changed the permissions of the libcuda.so.352-63 file to make it executable by anyone, and it works well now.
Hope this will be helpful to those still struggling with this issue.
I didn't try the docker image, but I guess that failure is also caused by a permission setting.
Try this command
sudo apt-get install nvidia-modprobe
As mentioned here:
https://github.com/tensorflow/tensorflow/issues/394
and
http://kkjkok.blogspot.in/2016_08_01_archive.html
After I updated the NVIDIA driver to 378.09 on Ubuntu 14.10 I had the same error, although all the permissions on the library files were set correctly.
Thanks to @PhoenixQ, I tried running with sudo and it worked.
After that I tried running without sudo one more time and the error had disappeared. I'm not sure what exactly happened, but maybe something was configured during the call with sudo that was not possible without sudo.
So the solution:
Try running the same thing with sudo.
After this, try running without sudo. Worked for me.
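Two of the answers above trace the failure to a dangling libcuda.so symlink or to file permissions. A rough way to check for both is sketched below. The helper name check_libs is made up for this example, and the readability test is only an approximation of the permission problem described (it always passes when run as root).

```shell
# List every file under DIR ($1) whose name starts with NAME ($2),
# flagging dangling symlinks and files the current user cannot read.
check_libs() {
    find "$1" -name "$2*" 2>/dev/null | sort | while IFS= read -r path; do
        if [ -L "$path" ] && [ ! -e "$path" ]; then
            printf 'broken symlink: %s\n' "$path"
        elif [ ! -r "$path" ]; then
            printf 'not readable: %s\n' "$path"
        else
            printf 'ok: %s\n' "$path"
        fi
    done
}

# Usage (not run here):
#   check_libs /usr libcuda.so
```

Run it as the same user that launches tensorflow, since a file readable by root may still be unreadable for that user.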

Binary file refuses to run due to a missing shared library

I tried building recutils version 1.7, downloaded from the home page, using the standard configure, make, sudo make install sequence, but when I try to run the resulting binaries, such as recinf, I get the error:
recinf: error while loading shared libraries: librec.so.1: cannot open shared object file: No such file or directory
Does this mean I made a mistake during the build or is the package itself in error?
As Etan Reisner said the problem was that the shared object libraries were installed but not loaded into the cache, hence the need to run ldconfig. After running
sudo ldconfig
the binaries ran properly. If I had looked in /usr/local/lib, I would have seen the libs there.
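To see which libraries the dynamic loader cannot resolve before and after running ldconfig, the output of ldd can be filtered, since ldd marks unresolved dependencies with "=> not found". This is a small sketch; the helper name missing_libs is made up for this example.

```shell
# Read `ldd` output on stdin and print the names of unresolved libraries.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Usage (not run here):
#   ldd "$(command -v recinf)" | missing_libs
```

An empty result after sudo ldconfig means every dependency now resolves. ldconfig -p prints the current cache if you want to confirm that librec.so.1 is listed.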

Perl and DBD::mysql Can't load mysql.so... Perhaps a required shared library or dll isn't installed where expected

I am running this code on a shared host with a locally installed perl and modules that were installed via perlbrew. It worked fine for several weeks. One day, it started dying with this output:
/home/xxxx/perl5/perlbrew/perls/perl-5.16.2/bin/perl tweet.pl
install_driver(mysql) failed: Can't load '/home/xxxx/perl5/perlbrew/perls/perl-5.16.2/lib/site_perl/5.16.2/x86_64-linux/auto/DBD/mysql/mysql.so' for module DBD::mysql: libmysqlclient.so.15: cannot open shared object file: No such file or directory at /home/xxxx/perl5/perlbrew/perls/perl-5.16.2/lib/5.16.2/x86_64-linux/DynaLoader.pm line 190.
at (eval 27) line 3.
Compilation failed in require at (eval 27) line 3.
Perhaps a required shared library or dll isn't installed where expected
at subroutines.pm line 3.
The code hasn't changed. The way I run the script hasn't changed, either. Since I am running this on a shared host, I have no idea what might have been updated or changed on the server, but perl is installed to my home directory, as are all the modules I am using.
It looks like a problem with libmysqlclient. What distribution are you running?
If you are running a Debian-based distribution, try "sudo apt-get purge libmysqlclient libmysqlclient-dev" and then "sudo apt-get install libmysqlclient libmysqlclient-dev".

no mpicxx when compiling examples for NVIDIA CUDA 5

I installed the driver and toolkit for CUDA 5 in 64-bit RHEL 6.3 successfully.
However, when I tried compiling the CUDA 5 examples, I got the error message:
make[1]: Leaving directory `/root/NVIDIA_CUDA-5.0_Samples/0_Simple/cppIntegration'
which: no mpicxx
How can I fix this for the CUDA 5 examples to compile?
In order to build the simpleMPI example, you need some kind of MPI installed on your system. You can get around this and build most of the samples by doing:
make -k
This will attempt to continue past errors in the make process and build all targets that can be built.
If you prefer, you can delete this directory:
/root/NVIDIA_CUDA-5.0_Samples/0_Simple/simpleMPI
perhaps with the following command, as root:
rm -Rf /root/NVIDIA_CUDA-5.0_Samples/0_Simple/simpleMPI
and relaunch your make. Personally I think the make -k option is simpler.
(The message about cppIntegration just refers to the last target that was built successfully.)