interfacing g77 and free pascal - freepascal

I have forced a free-pascal object to almost-work with a g77 main. The problem is that the g77 loader does not load a pascal library, so any attempt to insert writeln in the pascal code fails because FPC-IOCHECK, fpc_get_output .. are not found. Without I/O it works. Anyone knows a fix of the form -lPASCAL? There is a 30-year history that justifies this step; I understand perfectly that the request should seem foolish.
thanks.
I work with UBUNTU 20.4
g77 -v says
Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal --prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug x86_64-linux-gnu
Thread model: posix
gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu5)
There are three units:
main.f:
10 WRITE (6,'(A,$)') '0 to stop>'
READ (5,*) I
IF (I.EQ.0) STOP
CALL INFACE (I)
GOTO 10
STOP
END
SUBROUTINE PRINTER (I,R)
INTEGER*4 I
REAL*8 R
WRITE (6,*) 'received',I,R
RETURN
END
inter.c
include <stdio.h>
extern void SUB_$$_PROC1_$SMALLINT();
extern void SUB_$$_PRINTNOW_$SMALLINT$REAL();
extern void printer_();
extern void printnow();
extern void inface_(int*);
void inface_(i)
int *i; { int j; SUB_$$_PROC1_$SMALLINT(i); }
void SUB_$$_PRINTNOW_$SMALLINT$REAL(i,r)
int i; double r; {printf("route 3.0.4 "); printer_(&i,&r);}
void printnow_(i,r)
int i; double r; {printf("route 3.2.0 "); printer_(&i,&r);}
sub.p:
unit sub;
interface
procedure PROC1_(var i:integer);
procedure printnow_(i:integer; r:real); external;
implementation
procedure PROC1_(var i:integer);
var r:real;
begin r:=2*sqrt(i); printnow_(i,r); {writeln('a');} end;
end.
I have compiled with
fpc -Un sub.p (either versions 3.0.4 and 3.2.0)
g77 main.f inter.c sub.o
All works with writeln commented, the link fails otherwise.
Note that the C-routine called is different with fpc-3.0.4 and fpc-3.2.0.

Related

How to develop tool in C/C++ whose command interface is Tcl shell?

Suppose a tool X need to developed which are written in C/C++ and having Tcl commanline interface, what will the steps or way?
I know about Tcl C API which can be used to extend Tcl by writing C extension for it.
What you're looking to do is embedding Tcl (totally a supported use case; Tcl remembers that it is a C library) but still making something tclsh-like. The simplest way of doing this is:
Grab a copy of tclAppInit.c (e.g., this is the current one in the Tcl 8.6 source tree as I write this) and adapt it, probably by putting the code to register your extra commands, linked variables, etc. in the Tcl_AppInit() function; you can probably trim a bunch of stuff out simply enough. Then build and link directly against the Tcl library (without stubs) to get effectively your own custom tclsh with your extra functionality.
You can use Tcl's API more extensively than that if you're not interested in interactive use. The core for non-interactive use is:
// IMPORTANT: Initialises the Tcl library internals!
Tcl_FindExecutable(argv[0]);
Tcl_Interp *interp = Tcl_CreateInterp();
// Register your custom stuff here
int code = Tcl_Eval(interp, "your script");
// Or Tcl_EvalFile(interp, "yourScriptFile.tcl");
const char *result = Tcl_GetStringResult(interp);
if (code == TCL_ERROR) {
// Really good idea to print out error messages
fprintf(stderr, "ERROR: %s\n", result);
// Probably a good idea to print error traces too; easier from in Tcl
Tcl_Eval(interp, "puts stderr $errorInfo");
exit(1);
}
// Print a non-empty result
if (result[0]) {
printf("%s\n", result);
}
That's about all you need unless you're doing interactive use, and that's when Tcl_Main() becomes really useful (it handles quite a few extra fiddly details), which the sample tclAppInit.c (mentioned above) shows how to use.
Usually, SWIG (Simplified Wrapper and Interface Generator) is the way to go.
SWIG HOMEPAGE
This way, you can write code in C/C++ and define which interface you want to expose.
suppose you have some C functions you want added to Tcl:
/* File : example.c */
#include <time.h>
double My_variable = 3.0;
int fact(int n) {
if (n <= 1) return 1;
else return n*fact(n-1);
}
int my_mod(int x, int y) {
return (x%y);
}
char *get_time()
{
time_t ltime;
time(&ltime);
return ctime(&ltime);
}
Now, in order to add these files to your favorite language, you need to write an "interface file" which is the input to SWIG. An interface file for these C functions might look like this :
/* example.i */
%module example
%{
/* Put header files here or function declarations like below */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
%}
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
At the UNIX prompt, type the following:
unix % swig -tcl example.i
unix % gcc -fpic -c example.c example_wrap.c \
-I/usr/local/include
unix % gcc -shared example.o example_wrap.o -o example.so
unix % tclsh
% load ./example.so example
% puts $My_variable
3.0
% fact 5
120
% my_mod 7 3
1
% get_time
Sun Feb 11 23:01:07 2018
The swig command produces a file example_wrap.c that should be compiled and linked with the rest of the program. In this case, we have built a dynamically loadable extension that can be loaded into the Tcl interpreter using the 'load' command.
Taken from http://www.swig.org/tutorial.html

CUDA: Forgetting kernel launch configuration does not result in NVCC compiler warning or error

When I try to call a CUDA kernel (a __global__ function) using a function pointer, everything appears to work just fine. However, if I forget to provide launch configuration when calling the kernel, NVCC will not result in an error or warning, but the program will compile and then crash if I attempt to run it.
__global__ void bar(float x) { printf("foo: %f\n", x); }
typedef void(*FuncPtr)(float);
void invoker(FuncPtr func)
{
func<<<1, 1>>>(1.0);
}
invoker(bar);
cudaDeviceSynchronize();
Compile and run the above. Everything will work just fine. Then, remove the kernel's launch configuration (i.e., <<<1, 1>>>). The code will compile just fine but it will crash when you try to run it.
Any idea what is going on? Is this a bug, or I am not supposed to pass around pointers of __global__ functions?
CUDA version: 8.0
OS version: Debian (Testing repo)
GPU: NVIDIA GeForce 750M
If we take a slightly more complex version of your repro, and look at the code emitted by the CUDA toolchain front-end, it becomes possible to see what is happening:
#include <cstdio>
__global__ void bar_func(float x) { printf("foo: %f\n", x); }
typedef void(*FuncPtr)(float);
void invoker(FuncPtr passed_func)
{
#ifdef NVCC_FAILS_HERE
bar_func(1.0);
#endif
bar_func<<<1,1>>>(1.0);
passed_func(1.0);
passed_func<<<1,1>>>(2.0);
}
So let's compile it a couple of ways:
$ nvcc -arch=sm_52 -c -DNVCC_FAILS_HERE invoker.cu
invoker.cu(10): error: a __global__ function call must be configured
i.e. the front-end can detect that bar_func is a global function and requires launch parameters. Another attempt:
$ nvcc -arch=sm_52 -c -keep invoker.cu
As you note, this produces no compile error. Let's look at what happened:
void bar_func(float x) ;
# 5 "invoker.cu"
typedef void (*FuncPtr)(float);
# 7 "invoker.cu"
void invoker(FuncPtr passed_func)
# 8 "invoker.cu"
{
# 12 "invoker.cu"
(cudaConfigureCall(1, 1)) ? (void)0 : (bar_func)((1.0));
# 13 "invoker.cu"
passed_func((2.0));
# 14 "invoker.cu"
(cudaConfigureCall(1, 1)) ? (void)0 : passed_func((3.0));
# 15 "invoker.cu"
}
The standard kernel invocation syntax <<<>>> gets expanded into an inline call to cudaConfigureCall, and then a host wrapper function is called. The host wrapper has the API internals required to launch the kernel:
void bar_func( float __cuda_0)
# 3 "invoker.cu"
{__device_stub__Z8bar_funcf( __cuda_0); }
void __device_stub__Z8bar_funcf(float __par0)
{
if (cudaSetupArgument((void *)(char *)&__par0, sizeof(__par0), (size_t)0UL) != cudaSuccess) return;
{ volatile static char *__f __attribute__((unused)); __f = ((char *)((void ( *)(float))bar_func));
(void)cudaLaunch(((char *)((void ( *)(float))bar_func)));
};
}
So the stub only handles arguments and launches the kernel via cudaLaunch. It doesn't handle launch configuration
The underlying reason for the crash (actually an undetected runtime API error) is that the kernel launch happens without a prior configuration. Obviously this happens because the CUDA front end (and C++ for that matter) can't do pointer introspection at compile time and detect that your function pointer is a stub function for calling a kernel.
I think the only way to describe this is a "limitation" of the runtime API and compiler. I wouldn't say what you are doing is wrong, but I would probably be using the driver API and explicitly managing the kernel launch myself in such a situation.

package require with static lib

I am working on app which uses tcl package implemented in C++ and linked as static library (app is developed long time ago). It does following:
// Library code
extern "C" int testlib_SafeInit _ANSI_ARGS_((Tcl_Interp *interp))
{
return Tcl_PkgProvide(interp, "testlib", "1.6");
}
extern "C" int testlib_Init _ANSI_ARGS_((Tcl_Interp *interp))
{
return testlib_SafeInit(interp);
}
// Application code
extern "C" int testlib_SafeInit _ANSI_ARGS_((Tcl_Interp *interp));
extern "C" int testlib_Init _ANSI_ARGS_((Tcl_Interp *interp));
int main()
{
Tcl_Interp* interp = Tcl_CreateInterp();
Tcl_Init(interp);
Tcl_PkgProvide(interp, "testlib", "1.6");
Tcl_StaticPackage(interp, "testlib", testlib_Init, testlib_SafeInit);
Tcl_Eval(interp, "package require testlib");
std::cout << "Res = " << Tcl_GetStringResult(interp);
return 0;
}
When I am removing line Tcl_PkgProvide(interp, "testlib", "1.6"); from main, package becomes invisible. Also I have noticed that testlib_Init and testlib_SafeInit are not called. I am expecting that they must be called from package require testlib. As I understand from docs each package must have pkgIndex.tcl in auto_path or tcl_pkgPath which must contain line
(package ifneeded testlib 1.6 {load {} testlib}), but here both variables does not contain such index file.
Is this a correct way of providing packages? Is there a documentation related with providing packages using static libraries?
Well, the simplest technique for statically providing a package is to just install it directly. The package init code should be the one calling Tcl_PkgProvide — you don't do so from main() usually — and you probably don't need Tcl_StaticPackage at all unless you're wanting to install the code into sub-interpreters.
int main(int argc, char*argv[])
{
Tcl_FindExecutable(argv[0]);
Tcl_Interp* interp = Tcl_CreateInterp();
Tcl_Init(interp);
testlib_Init(interp);
// OK, setup is now done
Tcl_Eval(interp, "package require testlib");
std::cout << "Res = " << Tcl_GetStringResult(interp) << "\n";
return 0;
}
However, we can move to using Tcl_StaticPackage. That allows code to say “instead of loading a DLL with this sort of name, I already know that code: here are its entry points”. If you are doing that, you need to also install a package ifneeded script; those are done through the script API only.
int main(int argc, char*argv[])
{
Tcl_FindExecutable(argv[0]);
Tcl_Interp* interp = Tcl_CreateInterp();
Tcl_Init(interp);
Tcl_StaticPackage(interp, "testlib", testlib_Init, testlib_SafeInit);
Tcl_Eval(interp, "package ifneeded testlib 1.6 {load {} testlib}");
// OK, setup is now done
Tcl_Eval(interp, "package require testlib");
std::cout << "Res = " << Tcl_GetStringResult(interp) << "\n";
return 0;
}
The testlib in the load call needs to match the testlib in the Tcl_StaticPackage call. The testlib in the package require, package ifneeded and Tcl_PkgProvide also need to all match (as do the occurrences of 1.6, the version number).
Other minor issues
Also, you don't need to use the _ANSI_ARGS_ wrapper macro. That's utterly obsolete, for really ancient and crappy compilers that we don't support any more. Just replace _ANSI_ARGS_((Tcl_Interp *interp)) with (Tcl_Interp *interp). And remember to call Tcl_FindExecutable first to initialise the static parts of the Tcl library. If you don't have argv[0] available to pass into it, use NULL instead; it affects a couple of more obscure introspection systems on some platforms, but you probably don't care about them. However, initialising the library overall is very useful: for example, it lets you make sure that the filesystem's filename encoding scheme is correctly understood! That can be a little important to code…

Is there an API to get a more precise time for profiling including operating system?

In the Flash API there is the getTimer() method that returns the time in milliseconds since the SWF started. I was writing PHP recently and there is a much more precise method called, microtime() that returns the current Unix timestamp in microseconds. Does Flash have anything more precise than getTimer()?
Update:
If Flash hasn't yet added a more precise profile API is it possible to use ExternalInterface or make a process call to maybe get an operating system API? I have both desktop and browser applications and in desktop I can access native processes.
Note: This answer is in response to the comment thread
This is a cli cmd to access the clock_gettime() on Linux and clock_get_time() on OS-X. I use it on OS-X with a couple of bash scripts, but you can call it using native processes with Air
>> ./nsTime ; ./nsTime
s: 1446019237
ns: 99241000
s: 1446019237
ns: 104217000
Compile:
OS-X :
gcc -o nsTime nsTime.c
Linux:
gcc -o nsTime nsTime.c -lrt
C Code:
#include <time.h>
#include <sys/time.h>
#include <stdio.h>
#ifdef __MACH__
#include <mach/clock.h>
#include <mach/mach.h>
#endif
void current_utc_time(struct timespec *ts) {
#ifdef __MACH__ // OS X does not have clock_gettime, use clock_get_time
clock_serv_t cclock;
mach_timespec_t mts;
host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
clock_get_time(cclock, &mts);
mach_port_deallocate(mach_task_self(), cclock);
ts->tv_sec = mts.tv_sec;
ts->tv_nsec = mts.tv_nsec;
#else
clock_gettime(CLOCK_REALTIME, ts);
#endif
}
int main(int argc, char **argv) {
struct timespec ts;
current_utc_time(&ts);
printf("s: %lu\n", ts.tv_sec);
printf("ns: %lu\n", ts.tv_nsec);
return 0;
}
Disclaimer: This code came from the Intertubes, not sure of its original author
Take a look at Jackson Dunstan's sub-millisecond Timer that he wrote about in 2013. I haven't tried it myself, so I can't say if it's any good, but it tries to address your issues with the standard AS3 Timer.

Loading multiple modules in JCuda is not working

In jCuda one can load cuda files as PTX or CUBIN format and call(launch) __global__ functions (kernels) from Java.
With keeping that in mind, I want to develop a framework with JCuda that gets user's __device__ function in a .cu file at run-time, loads and runs it.
And I have already implemented a __global__ function, in which each thread finds out the start point of its related data, perform some computation, initialization and then call user's __device__ function.
Here is my kernel pseudo code:
extern "C" __device__ void userFunc(args);
extern "C" __global__ void kernel(){
// initialize
userFunc(args);
// rest of the kernel
}
And user's __device__ function:
extern "C" __device__ void userFunc(args){
// do something
}
And in Java side, here is the part that I load the modules(modules are made from ptx files which are successfully created from cuda files with this command: nvcc -m64 -ptx path/to/cudaFile -o cudaFile.ptx)
CUmodule kernelModule = new CUmodule(); // 1
CUmodule userFuncModule = new CUmodule(); // 2
cuModuleLoad(kernelModule, ptxKernelFileName); // 3
cuModuleLoad(userFuncModule, ptxUserFuncFileName); // 4
When I try to run it I got error at line 3 : CUDA_ERROR_NO_BINARY_FOR_GPU. After some searching I get that my ptx file has some syntax error. After running this suggested command:
ptxas -arch=sm_30 kernel.ptx
I got:
ptxas fatal : Unresolved extern function 'userFunc'
Even when I replace line 3 with 4 to load userFunc before kernel I get this error. I got stuck at this phase. Is this the correct way to load multiple modules that need to be linked together in JCuda? Or is it even possible?
Edit:
Second part of the question is here
The really short answer is: No, you can't load multiple modules into a context in the runtime API.
You can do what you want, but it requires explicit setup and execution of a JIT linking call. I have no idea how (or even whether) that has been implemented in JCUDA, but I can show you how to do it with the standard driver API. Hold on...
If you have a device function in one file, and a kernel in another, for example:
// test_function.cu
#include <math.h>
__device__ float mathop(float &x, float &y, float &z)
{
float res = sin(x) + cos(y) + sqrt(z);
return res;
}
and
// test_kernel.cu
extern __device__ float mathop(float & x, float & y, float & z);
__global__ void kernel(float *xvals, float * yvals, float * zvals, float *res)
{
int tid = threadIdx.x + blockIdx.x * blockDim.x;
res[tid] = mathop(xvals[tid], yvals[tid], zvals[tid]);
}
You can compile them to PTX as usual:
$ nvcc -arch=sm_30 -ptx test_function.cu
$ nvcc -arch=sm_30 -ptx test_kernel.cu
$ head -14 test_kernel.ptx
//
// Generated by NVIDIA NVVM Compiler
//
// Compiler Build ID: CL-19324607
// Cuda compilation tools, release 7.0, V7.0.27
// Based on LLVM 3.4svn
//
.version 4.2
.target sm_30
.address_size 64
// .globl _Z6kernelPfS_S_S_
.extern .func (.param .b32 func_retval0) _Z6mathopRfS_S_
At runtime, your code must create a JIT link session, add each PTX to the linker session, then finalise the linker session. This will give you a handle to a compiled cubin image which can be loaded as a module as usual. The simplest possible driver API code to put this together looks like this:
#include <cstdio>
#include <cuda.h>
#define drvErrChk(ans) { drvAssert(ans, __FILE__, __LINE__); }
inline void drvAssert(CUresult code, const char *file, int line, bool abort=true)
{
if (code != CUDA_SUCCESS) {
fprintf(stderr, "Driver API Error %04d at %s %d\n", int(code), file, line);
exit(-1);
}
}
int main()
{
cuInit(0);
CUdevice device;
drvErrChk( cuDeviceGet(&device, 0) );
CUcontext context;
drvErrChk( cuCtxCreate(&context, 0, device) );
CUlinkState state;
drvErrChk( cuLinkCreate(0, 0, 0, &state) );
drvErrChk( cuLinkAddFile(state, CU_JIT_INPUT_PTX, "test_function.ptx", 0, 0, 0) );
drvErrChk( cuLinkAddFile(state, CU_JIT_INPUT_PTX, "test_kernel.ptx" , 0, 0, 0) );
size_t sz;
char * image;
drvErrChk( cuLinkComplete(state, (void **)&image, &sz) );
CUmodule module;
drvErrChk( cuModuleLoadData(&module, image) );
drvErrChk( cuLinkDestroy(state) );
CUfunction function;
drvErrChk( cuModuleGetFunction(&function, module, "_Z6kernelPfS_S_S_") );
return 0;
}
You should be able to compile and run this as posted and verify it works OK. It should serve as a template for a JCUDA implementation, if they have JIT linking support implemented.