My purpose in using LAPACK is to calculate the Cholesky factorization of a matrix. I am programming in C/C++ on Fedora, but I am confused about which LAPACK to install: LAPACK with LAPACKE, or CLAPACK?
The basic difference between the two is the need for a Fortran compiler.
CLAPACK is basically just the reference NETLIB LAPACK routines passed through the old f2c converter, allowing the library to be compiled with a C compiler.
LAPACKE is an attempt (started by Intel IIRC) to define a formal C language interface for Fortran LAPACK libraries. It has the advantage that it is independent of the LAPACK implementation and hides the toolchain-specific details of C to Fortran interoperability, so that the programmer doesn't have to worry about them. LAPACKE also has the distinct advantage of working correctly with the C99 complex intrinsic type.
I would not expect a major performance difference between the two (the choice of BLAS dictates most of that), but I would probably favor LAPACKE plus the LAPACK and BLAS implementation of choice if I were to start from scratch today.
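For the Cholesky use case in the question, LAPACKE exposes the factorization as LAPACKE_dpotrf(LAPACK_ROW_MAJOR, 'L', n, a, lda), declared in lapacke.h, which overwrites the lower triangle of a with L and returns an info code. The sketch below is not that routine. It is a small, dependency-free reference implementation in C++ that may be handy for sanity-checking library results on tiny matrices. The name cholesky_lower and the row-major std::vector storage are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LAPACK): factor a symmetric positive
// definite n x n matrix A, stored row-major, as A = L * L^T.
// Only the lower triangle of `a` is read; on success it is overwritten
// with L and the function returns 0. If the leading minor of order k is
// not positive definite, it returns k (1-based), which mirrors the meaning
// of a positive `info` value from dpotrf.
int cholesky_lower(std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    for (std::size_t j = 0; j < n; ++j) {
        // Diagonal entry: L[j][j] = sqrt(A[j][j] - sum_k L[j][k]^2)
        double d = a[j * n + j];
        for (std::size_t k = 0; k < j; ++k) {
            d -= a[j * n + k] * a[j * n + k];
        }
        if (d <= 0.0) {
            return static_cast<int>(j) + 1;
        }
        a[j * n + j] = std::sqrt(d);

        // Entries below the diagonal in column j
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (std::size_t k = 0; k < j; ++k) {
                s -= a[i * n + k] * a[j * n + k];
            }
            a[i * n + j] = s / a[j * n + j];
        }
    }
    return 0;
}
```

For real workloads the library routine is the better choice, since it is blocked and built on an optimized BLAS; this loop is only meant to show what is being computed.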
Before I dive too deep into CUDA programming I need to orient myself. The NVIDIA CUDA programming guides made a distinct change from referring to "CUDA C" to "CUDA C++" between versions 10.1 and 10.2. Since this was a minor version change, I suspect it is just semantics. I compared sample code from pre-10.1 and post-10.2 and found no difference...though that doesn't mean there is no difference. Was there a more subtle programming paradigm shift between these versions?
Here's my suspicion: CUDA has always been an extension of C++, not C, but everyone has referred to it as CUDA C because we don't take advantage of the OOP offered by C++ when writing CUDA code. Is that a fair assessment?
I think your assessment is reasonable conjecture. People are sometimes imprecise in their references to C and C++, and so CUDA probably hasn't been very rigorous here either. There is some history though that suggests to me this is not purely hand-waving.
CUDA started out as largely a C-style realization, but over time added C++ style features. Certainly by CUDA 4.0 (circa 2010) if not before, there were plenty of C++ style features.
Lately, CUDA drops the reference to C but claims compliance to a particular C++ ISO standard, subject to various enumerated restrictions and limitations.
The CUDA compiler, nvcc, behaves by default like a C++ style compiler (so, for example, using C++ style mangling), and will by default invoke the host-code C++ compiler (e.g. g++) not the host code C compiler (e.g. gcc) when passing off host code to be compiled.
As you point out, a programmer can use the C++ language syntactically in a very similar way to C usage (e.g. without the use of classes, to pick one example). This is also true for CUDA C++.
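To make the distinction concrete, here is a plain C++ sketch (host code only, no CUDA toolkit involved) that mixes C-style code with a few features that only compile because the compiler is a C++ compiler. All names (Vec2, dot, scale, clamp_to, dot_c) are made up for illustration.

```cpp
#include <cassert>

// C-style code: a plain struct and a free function, no classes.
// This part would compile as C too.
struct Vec2 {
    float x;
    float y;
};

float dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

// Overloading exists only in C++. It works because the compiler encodes
// the parameter types into each symbol name (name mangling).
float scale(float v, float s) { return v * s; }
Vec2 scale(Vec2 v, float s) { return Vec2{v.x * s, v.y * s}; }

// Templates are another C++-only feature.
template <typename T>
T clamp_to(T v, T lo, T hi) {
    assert(!(hi < lo));
    return v < lo ? lo : (hi < v ? hi : v);
}

// extern "C" switches mangling off for this one symbol, so that code
// built with a C compiler can link against it by its plain name.
extern "C" float dot_c(float ax, float ay, float bx, float by) {
    return dot(Vec2{ax, ay}, Vec2{bx, by});
}
```

The first two definitions look like C, yet the file as a whole is only valid C++, which is roughly the situation the answer describes for CUDA sources.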
It's not possible to build Rome in a day, and so CUDA development has proceeded in various areas at various rates. For example, one stated limitation of CUDA is that elements of the standard library (std::) are not necessarily supported in device code. However various CUDA developers are working to gradually fill in this gap with the libcu++ evolution.
Is there any compiled language that has garbage collection built in?
To my understanding right now, the purpose of an interpreter or JVM is to make binaries platform independent. Is it also because of the GC? Or is GC possible in compiled code?
SML, OCaml, Eiffel, D, Go, and Haskell are all statically-typed languages with garbage collection that are typically compiled ahead of time to native code.
As you correctly point out, virtual machines are mostly used to abstract away machine-dependent properties of underlying platforms. Garbage collection is an orthogonal technology. Usually it is not mandatory for a language, but is considered a desired property of a run-time environment. There are indeed languages with primitives to allocate memory (e.g., new in Java and C#) but without primitives to release it. They can be thought of as languages with built-in GC.
One such programming language is Eiffel. Most Eiffel compilers generate C code for portability reasons, and a standard C compiler then turns this C code into machine code. Eiffel implementations provide GC (and sometimes even precise GC) for this compiled code, so there is no need for a VM. In particular, the VisualEiffel compiler generated native x86 machine code directly, with full GC support.
Garbage collection is possible in compiled languages.
The Boehm GC is a well-known garbage collector for C and C++ (see its Wikipedia article).
Another example is the D programming language, which has garbage collection built in.
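To illustrate that a collector is ordinary runtime code that gets compiled to native code like everything else, here is a toy mark-and-sweep collector in C++. It is a minimal sketch, not how the Boehm GC or D's collector is implemented: roots are registered by hand instead of being found on the stack, and the names Object and ToyHeap are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// A heap object with outgoing references to other heap objects.
struct Object {
    bool marked = false;
    std::vector<Object*> refs;
};

class ToyHeap {
public:
    // Allocate an object owned by the heap. There is no matching "free".
    Object* allocate() {
        objects_.push_back(std::unique_ptr<Object>(new Object()));
        return objects_.back().get();
    }

    // Roots stand in for the stack and global variables of a real program.
    void add_root(Object* o) {
        assert(o != nullptr);
        roots_.push_back(o);
    }
    void clear_roots() { roots_.clear(); }

    // Mark everything reachable from the roots, then free the rest.
    // Returns the number of objects reclaimed.
    std::size_t collect() {
        // Mark phase: iterative graph traversal from the roots.
        std::vector<Object*> worklist(roots_.begin(), roots_.end());
        while (!worklist.empty()) {
            Object* o = worklist.back();
            worklist.pop_back();
            if (o->marked) continue;
            o->marked = true;
            for (Object* r : o->refs) worklist.push_back(r);
        }
        // Sweep phase: keep marked objects, destroy unmarked ones.
        const std::size_t before = objects_.size();
        std::vector<std::unique_ptr<Object>> live;
        for (auto& p : objects_) {
            if (p->marked) {
                p->marked = false;  // reset for the next cycle
                live.push_back(std::move(p));
            }
        }
        objects_ = std::move(live);
        return before - objects_.size();
    }

    std::size_t live_count() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Object>> objects_;
    std::vector<Object*> roots_;
};
```

Because reachability is traced from the roots, objects that only reference each other in a cycle are still reclaimed once no root leads to them. Nothing here needs an interpreter or a VM.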
https://nim-lang.org
The Nim language is making progress too, and it is quite portable because it generates C, C++, JavaScript or Objective-C code.
I want to start using CUDA from C++, and I am familiar with C++, Qt and C#.
But I want to know whether it is better to use the CUDA libraries (at a high level) or the CUDA APIs (at a lower level).
Is it better to start with the API and not use the CUDA driver?
(I am starting with "CUDA by Example" for its parallel programming concepts.)
Since you are familiar with C/C++, you'd better use the higher-level API, CUDA C or C for CUDA, which is more convenient and easier to write, because it consists of a minimal set of extensions to the C language and a runtime library.
The lower-level API, which is the CUDA driver API that provides an additional level of control by exposing lower-level concepts, requires more code, is harder to program and debug, but offers a better level of control and is language-independent since it handles binary or assembly code.
See Chapter 3 of CUDA programming guide for more details.
This is a bit of a silly question, but I'm wondering if CUDA uses an interpreter or a compiler?
I'm wondering because I'm not quite sure how CUDA manages to get source code to run on two cards with different compute capabilities.
From Wikipedia:
Programmers use 'C for CUDA' (C with Nvidia extensions and certain restrictions), compiled through a PathScale Open64 C compiler.
So, your answer is: it uses a compiler.
And to touch on the reason it can run on multiple cards (source):
CUDA C/C++ provides an abstraction, it's a means for you to express how you want your program to execute. The compiler generates PTX code which is also not hardware specific. At runtime the PTX is compiled for a specific target GPU - this is the responsibility of the driver which is updated every time a new GPU is released.
The official documents CUDA C Programming Guide and The CUDA Compiler Driver (NVCC) explain the compilation process in detail.
From the second document:
nvcc mimics the behavior of the GNU compiler gcc: it accepts a range
of conventional compiler options, such as for defining macros and
include/library paths, and for steering the compilation process.
This is not limited to CUDA: shaders in DirectX or OpenGL are also compiled to some kind of bytecode and converted to native code by the underlying driver.
I did my physics simulation project using C++ and OpenGL in Visual Studio 10. Later I used OpenMP for CPU parallelization. Now I want to port my C++ code to CUDA so that I can achieve higher performance. Is it possible to convert my code to CUDA, or to run it on any GPU device?
CUDA and C++ are different programming languages (even if they look syntactically similar) with different programming paradigms.
You'll have to recode, and perhaps even redesign, your project to take advantage of CUDA (or of OpenCL).
Actually, you'll need to identify the numerical kernels that might benefit from your GPGPU and then recode these kernels (in CUDA, or in OpenCL); you'll also have to write some glue code to make all this work together.
You can determine which parts of your project can be parallelized and then reimplement these parts in CUDA. You can take a look at Fast N-Body Simulation with CUDA.
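As a rough idea of what "identifying a kernel" can look like before any CUDA is written, the C++ sketch below isolates one per-particle update into a function that touches only index i, and keeps the loop as separate glue code. The Particles layout, the function names and the simple Euler step are assumptions for illustration, not taken from the project in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle state stored as a structure of arrays, a layout
// that usually maps well onto GPU memory.
struct Particles {
    std::vector<float> x;  // positions
    std::vector<float> v;  // velocities
    std::vector<float> a;  // accelerations
};

// "Kernel": updates particle i only, and reads or writes nothing that
// belongs to another particle. In CUDA this body would go into a
// __global__ function, with i typically computed as
// blockIdx.x * blockDim.x + threadIdx.x.
inline void integrate_one(std::size_t i, float dt,
                          float* x, float* v, const float* a) {
    v[i] += a[i] * dt;
    x[i] += v[i] * dt;
}

// Glue code: on the CPU the "launch" is a loop over all particles.
// Since iterations are independent, an OpenMP "parallel for" pragma
// could be placed on this loop as well.
void integrate_all(Particles& p, float dt) {
    const std::size_t n = p.x.size();
    assert(p.v.size() == n && p.a.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        integrate_one(i, dt, p.x.data(), p.v.data(), p.a.data());
    }
}
```

Loops that can be rewritten in this shape, with no dependency between iterations, are the natural first candidates for porting. Steps that need data from all other particles, such as the force computation in an N-body simulation, take more redesign.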