In C, if I want to see how a function works, I open the library that provides it and read the code. How can I see the implementations of Lisp functions? For example, the intersection function.
You can also look at the source code of Lisp functions.
For example, the source files for CLISP, one Common Lisp implementation, are available here: http://www.clisp.org/impnotes/src-files.html
If you want to examine the implementation of functions related to lists, you can look at the file: http://clisp.cvs.sourceforge.net/viewvc/clisp/clisp/src/list.d
The usual answer is "M-."
Assuming you have a properly configured IDE, and the source code of the function, clicking on its name and pressing M-. (that's Meta, or Alt or Option or Escape, and dot/period; or whatever key your IDE uses) should reveal its definition (or, for a generic function, definitions, plural; including any compiler macros that might optimize out some cases). Sometimes it's on a right-click or other mouse menu or toolbar.
If the source isn't available, you can often see the actual compiled form by evaluating (disassemble 'function)
Most IDEs, including perennial favourite Emacs+Slime, have other Inspection operations on the menu as well.
In a non-IDE environment, most compilers have reflection tools of their own (compiler-dependent), which are usually also mapped by the Swank library that Slime uses; one might find useful functions in that package.
And this really should be documented in your IDE's manual.
I should add a postscript:
You really shouldn't need to care about the implementation of the core library functions; their contractual behavior is very well documented in the CLHS, which is available online, and e.g. Quicklisp has a utility to link it to Slime (C-c C-d h on a symbol in the COMMON-LISP package). For all well-written Lisp libraries, there should be documentation attached to functions, variables, classes, etc., accessible via the documentation function in the REPL or the IDE's menus and Inspection windows.
Core library functions are often highly optimized and far more complex than most user-level code should want to be, and they often call down into compiler-specific "guts" that application code should avoid touching.
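For comparison, the two lookups described above (jump to the source, or fall back to the compiled form) have close counterparts in other dynamic languages. The sketch below uses Python's standard inspect and dis modules purely as an analogy to M-. and disassemble; it is not Lisp code, and the helper names find_source and disassembly are hypothetical.

```python
import dis
import inspect
import json


def find_source(func):
    """Analogous to M-.: return the source text of func's definition."""
    return inspect.getsource(func)


def disassembly(func):
    """Analogous to (disassemble 'function): list the compiled opcodes."""
    return [ins.opname for ins in dis.get_instructions(func)]


def add(a, b):
    return a + b


# json.dumps is written in Python, so its source file can be located.
source_text = find_source(json.dumps)

# The compiled form is available even when the source file is not.
opnames = disassembly(add)
```

As with the Lisp tools, the source lookup only works when the source file is actually installed, while the disassembly works from the compiled object alone.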
I was looking for get and put to be available in ISOIO mode but they are not. What is enabled with ISOIO?
The details are undocumented. It seems to be mostly a subset of {$mode iso} for reading and writing files: it reroutes the RTL handlers for reset/read/write to the ones for $mode ISO, and limits the types allowed for read/write in text mode.
It also enables lookahead with filetype^ (which is probably the reason why there are _ISO-specific handlers in the first place, together with the ISO form of the RESET() statement), and variables of ISO file types seem to be initialized (under some circumstances).
I don't see get/put being enabled, but I'm no compiler expert, so I might have missed that. You can test that yourself. (Whoops, on rereading your post, you already did.)
So I think the answer is primarily lookahead with the ^ operator.
Added later: response from Pascaldragon
A Pascal developer more into dialectal items finally reacted, which I quote here verbatim:
Put and Get are not part of modeswitch ISOIO, because they're not intrinsics and are instead provided by the ISO7185 unit which is only used for modeswitch ISO. As that unit also contains functionality that's not covered by the ISOIO modeswitch (some types, Round functions) it's not used for that modeswitch, but only together with the mode.
So basically the implementation is a library thing, and can't easily be decoupled from the other library based ISO stuff.
I'm new to Julia, and I'm trying to understand, at the language level, what ccall is. At the syntax level, it looks like a normal function, but it clearly doesn't behave the same way in how it takes its arguments:
Note that the argument type tuple must be a literal tuple, and not a
tuple-valued variable or expression.
Additionally, if I evaluate a variable bound to a function in the Julia REPL, I get something like
julia> max
max (generic function with 15 methods)
But if I try to do the same with ccall:
julia> ccall
ERROR: syntax: invalid "ccall" syntax
Clearly, ccall is a special piece of syntax, but it's also not a macro (no @ prefix, and invalid macro usage gives a more specific error). So, what is it? Is it something baked into the language, or something I could define myself with some language construct I'm not familiar with?
And if it is some baked-in piece of syntax, why was it decided to use function call notation, instead of implementing it as a macro or designing a more readable and distinct syntax?
In the current nightly (and thus, upcoming 0.6 release), much of the special behavior you observe has been removed (see this pull-request). ccall is no longer a reserved word, so it can be used as a function or macro name.
However there is still a slight oddity: defining a 3- or 4-argument function called ccall is allowed, but actually calling such a function will give an error about ccall argument types (other numbers of arguments are ok). The reasons go directly to your question:
So, what is it? Is it something baked into the language
Yes, ccall, though it will no longer be a keyword in 0.6, is still "baked in" to the language in several ways:
the :ccall([four args...]) expression form is recognized and specially handled during syntax lowering. This lowering step does several things including wrapping arguments in a call to unsafe_convert, which allows for customized conversion from Julia objects to C-compatible objects; as well as pulling out arguments that might need to be rooted to prevent garbage collection of a referenced object during the ccall. (see code_lowered output, or try the expand function; more info on the compiler here).
ccall requires extensive handling in the code generation backend, including: look-up of the requested function name in the specified shared library, and generation of an LLVM call instruction -- which is eventually translated to platform-specific machine code by the LLVM Just-In-Time compiler. (see the different stages with code_llvm and code_native).
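The look-up-then-call step described above is not unique to Julia. As a rough analogy only, the sketch below does the same thing at run time with Python's ctypes module, using the C API of the running interpreter as the "shared library". It assumes CPython, and it says nothing about how Julia implements ccall internally.

```python
import ctypes
import sys

# Step 1: look up a C function by name in an already loaded shared library.
# ctypes.pythonapi is a handle to the C API of the running CPython interpreter.
py_get_version = getattr(ctypes.pythonapi, "Py_GetVersion")

# Step 2: declare the C signature, much like ccall's return type and
# argument-type tuple: const char *Py_GetVersion(void).
py_get_version.restype = ctypes.c_char_p
py_get_version.argtypes = []

# Step 3: call through the foreign-function interface; ctypes converts the
# returned char* into a Python bytes object, which is decoded to str here.
version = py_get_version().decode()
```

The difference is where the work happens: ctypes resolves the symbol and converts values on every call at run time, whereas the answer describes Julia doing the equivalent work during lowering and code generation.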
And if it is some baked-in piece of syntax, why was it decided to use
function call notation, instead of implementing it as a macro or
designing a more readable and distinct syntax?
For the reasons detailed above, ccall requires special handling whether it looks like a macro or a function. In this mailing list thread, one of the Julia creators (Stefan Karpinski) commented on why not to make it a macro:
I suppose we could reimplement it as a macro, but that would really just be pushing the magic further down.
As far as "a more readable and distinct syntax", perhaps that is a matter of taste. It's not clear to me why some other syntax would be preferable (except for the convenience of a LuaJIT/CFFI-style inline C syntax parsing, of which I am a fan). My only strong personal wish for ccall would be to have arguments and types entered adjacent (e.g. ccall((:foo, :libbar), Void, (x::Int, y::Float))), because working with longer argument lists can be inconvenient. In 0.6 it will be possible to implement this form as a macro!
In Julia 0.5 and earlier.
It is not a function and it is not a macro.
It is indeed something special baked into the language.
It is an Intrinsic.
In Julia 0.6 this changes.
In a lot of ways it is more like a macro than a function call.
But in other ways it is not -- it does not return an AST.
It does call a function and on a low enough level it looks similar to calling a julia function.
The history of why it looks the way it does is beyond me; you'd need to hear from one of the people who worked on the earliest code for the language.
Right now it is everywhere, and is one of the harder things to change -- but not impossible. It would trigger up to 3 years of bikeshedding though :-P .
I like to think of ccall as being two things.
Foreign Function Interface, for C and other compiled languages (e.g. Fortran and Rust apparently work)
Way to access the raw guts of the language "runtime".
Foreign Function Interface (FFI)
Most of the time when one uses ccall in a package, one wants to invoke some code that is in a compiled library. In this sense it is C-Call, like R-Call or Py-Call.
I think mlewe/BlossomV.jl is a nice compact example.
For a more intense example oxinabox/SLEEF.jl.
As an FFI, it does not have to share memory space/a process with julia -- PyCall.jl does, RCall.jl and Matlab.jl don't.
It doesn't matter as long as the result comes back.
In these cases it is theoretically possible to replace ccall with some kind of safe_ccall which would run the called library in a separate process, and would not segfault julia if the library being called segfaulted.
But as of yet, no-one has written such a method/package.
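To illustrate the safe_ccall idea, here is a Python sketch (Python only because it is easy to run; nothing here is Julia API, and the helper name run_isolated is hypothetical). It runs a snippet in a child interpreter, so a crash in native code kills only the child, and the parent just sees a non-zero exit status.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=30):
    """Run a Python snippet in a child process; return (ok, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # A crash (e.g. a segfault in native code) shows up as a non-zero
    # return code instead of taking the parent process down with it.
    return proc.returncode == 0, proc.stdout.strip()


# A well-behaved call: the result comes back over stdout.
ok, out = run_isolated("import math; print(math.sqrt(16.0))")

# A call that reads from a NULL pointer in native code: only the child fails.
crashed_ok, _ = run_isolated("import ctypes; ctypes.string_at(0)")
```

The price is the one hinted at above: arguments and results have to be serialised across the process boundary, so this only fits calls where "the result comes back" is all that matters.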
Using ccall for FFI is even done in Base, like for accessing MPFR to define BigFloat.
But this is not the main reason ccall is used in Base.
Accessing the guts of the language.
ccall is really what drives a large portion of the program "doing a thing".
It is used throughout Base, to call the functions from src.
For this, ccall basically triggers a function call at the compiled level, that shifts the instruction pointer directly into the compiled code of the ccalled function. Like calling a function would if the whole thing had been written in say C.
You can see in base/threadingconstructs.jl ccall being used to manage work on threads -- that triggers code from src/threading.c.
It is used to map a section of disk to memory. mmap.jl. -- obviously can't be done from another process.
It is used to make a section of code non-interruptible.
It is used to call LibC to do things like malloc to allocate memory (though right now this is mostly used as part of FFI).
There are tricks you can do with ccall to #undef a variable after it has already been assigned.
ccall is in many ways the "master" key to the language.
Conclusion
I've described ccall here as two things, a FFI function and a core part of the language "runtime". This duality is not real, and there is plenty of overlap, like filehandling (is it FFI?).
The behaviour many expect ccall to have comes from its FFI uses.
Here ccall could just be a function.
The behaviour it actually has comes from its use as a core part of the language -- linking the Julia code of the standard library in Base to the low-level C code from src, allowing very direct control over the running of the Julia process.
I discovered that a lot of "special forms" are just macros that use their asterisk versions in the background (fn*, let* and all the others).
In case of fn, for example, it adds the destructuring capability into the mix which fn* alone does not provide. I tried to find some detailed documentation of what fn* can and can't do on its own, but I was not so lucky.
It definitely supports:
the &/catchall indicator
(fn* [x & rest] (do-smth-here...))
and curiously also the overloading on arity, as in:
(fn* ([x] (smth-with-one-arg ...))
     ([x y] (smth-with-two-args ...)))
So my question finally is, why not only define:
(fn& [& all-args] ...)
which would be absolutely minimal and could provide all the arity selection via macros (checking size of parameter list, if/case statement to direct code path, bind first few parameters to desired symbol, etc..).
Is this for performance reasons? Maybe someone even has a link to the actual standard definition of the asterisk special forms handy.
Arity selection leverages the JVM virtual method dispatch: each arity (from 0 to 20 arguments) has its own method and there's a single method for 21+-arg arities.
You may notice the applyTo method, which is the generic method akin to what you propose with fn&. Its implementation is just a giant switch to select the correct specialized method.
Yes, you could do all that as macros on top of the ultra-primitive fn& that you propose, and it would certainly simplify the compiler implementation. The reason this is not done is partly for performance reasons (it would be rather slow, and the JVM already has a fast facility for dispatching based on arity), and partly "cosmetic": it means that each arity of a function is a different method of the JVM class that a function is compiled down to, which makes stacktraces nicer. This also helps the JIT "understand" our functions better, so that it can optimize accordingly.
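To make the trade-off concrete, here is a small sketch of the fn& idea from the question: one catch-all entry point that picks an implementation by counting its arguments, much like the switch in applyTo. It is written in Python only for illustration; the names make_multi_arity and area are hypothetical, and the real Clojure compiler emits one JVM method per arity rather than doing this lookup.

```python
def make_multi_arity(*impls):
    """Build a variadic function that dispatches on the number of arguments."""
    # Map each implementation's fixed parameter count to the implementation.
    by_arity = {impl.__code__.co_argcount: impl for impl in impls}

    def dispatcher(*args):
        # This run-time lookup is the cost that per-arity methods avoid.
        impl = by_arity.get(len(args))
        if impl is None:
            raise TypeError(f"wrong number of args ({len(args)})")
        return impl(*args)

    return dispatcher


area = make_multi_arity(
    lambda side: side * side,              # one argument: a square
    lambda width, height: width * height,  # two arguments: a rectangle
)
```

Every call goes through dispatcher first, which is the extra indirection (and the less informative stack frame) that the answers above attribute to the catch-all approach.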
My guess would be that this is convenience/extensibility driven. The compiler (where fn* is actually "defined"/processed) is written in java and handles the minimum needed functionality in order to bootstrap the language while fn is a macro that builds on top of it. Same with some of the other forms. Somewhere there was a statement from Rich that he could rewrite the compiler from java to clojure but does not see the benefit (correct me if wrong).
Interpreted languages are usually more high-level and therefore have features such as dynamic typing (including creating new variables dynamically without declaration), the infamous eval and many other features that make a programmer's life easier - but why can't compiled languages have these as well?
I don't mean languages like Java that run on a VM, but those that compile to binary like C(++).
I'm not going to make a list now but if you are going to ask which features I mean, please look into what PHP, Python, Ruby etc. have to offer.
Which common features of interpreted languages can't/don't/do exist in compiled languages? Why?
Whether source code is compiled - to native binaries, some kind of intermediate language (Java Bytecode/IL) - or interpreted is absolutely no trait of the language. It's just a question of the implementation.
You can actually have both compilers and interpreters for the same language like
Haskell: GHC <-> GHCI
C: gcc <-> ch
VB6: VS IDE <-> VB6 compiler
Certain language features like eval or dynamic typing may suggest a distinction between so called "dynamic languages" and static ones, but how this is run can never be the primary question.
Initially, one of the largest benefits of interpreted languages was debugging. That way you can get incredibly accurate and detailed information when looking for the reason a program isn't working. However, most compilers have become advanced enough that that is not too big of a deal any more.
The other main benefit (in my opinion anyway), is that with interpreted languages, you don't have to wait for eternity for your project to compile to test it out.
You couldn't plausibly do eval, for example, for reasons I'd have thought were pretty obvious: exactly how would you implement it? Make the runtime contain a full copy of the compiler? Every time you wanted to evaluate a string (keeping in mind that each time it could be different!) you'd save the string to a file, run the compiler on it to make a DLL/shared-lib, then load that DLL/shared-lib and call your code? You can't see why this might be a wee bit impractical? ;)
You can find this kind of thing in dynamic languages all over the place that you can't do with static code short of basically running an interpreter, in effect, behind the scenes.
Continuing on from Dario - I think you are really asking why a compiled program can't evaluate statements at runtime (e.g. eval). Here's some reasons I can think of:
The full compiler would have to be distributed with the program (or be part of the program)
For an eval function to have access to type information and symbols (such as variable names and function names) in the environment it was used the original program would have to be compiled with those symbols accessible (compiled languages usually remove these symbols at compile time).
Edit: As noted neither of these reasons make it impossible for a language/compiler to be able to evaluate code at runtime, but they are definitely things that need to be taken into consideration when developing a compiler or when designing a language.
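Both points can be seen in a language that does ship its compiler in the runtime. The Python sketch below is only an illustration of the idea, not a statement about any C or C++ toolchain: compile is the compiler living inside the running program, and the dictionary passed to eval plays the role of the symbol table that a stripped native binary would no longer have. The variable names are made up for the example.

```python
# The runtime carries its own compiler: turn a source string into a code object.
source = "price * quantity + shipping"
code = compile(source, "<user-expression>", "eval")

# The evaluated code can only see names that still exist at run time, so the
# environment is passed explicitly as a symbol table.
environment = {"price": 4, "quantity": 3, "shipping": 5}
total = eval(code, {"__builtins__": {}}, environment)

# The names the compiled expression has to resolve at run time.
required_names = set(code.co_names)
```

If any of the required names were missing from the environment, the evaluation would fail with a NameError, which is the run-time counterpart of the "symbols were removed at compile time" problem.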
Maybe the question is not about interpreted/compiled languages (compile is ambiguous anyway) but about languages that do/don't carry their own compiler around with them? For instance we've said C++ could do eval with a handy compiler floating around in the app, and reflection presumably is similar in some ways.
I've heard this term used a lot in the same context as logging, but I can't seem to find a clear definition of what it actually is.
Is it simply a more general class of logging/monitoring tools and activities?
Please provide sample code/scenarios when/how instrumentation should be used.
I write tools that perform instrumentation. So here is what I think it is.
DLL rewriting. This is what tools like Purify and Quantify do. A previous reply to this question said that they instrument post-compile/link. That is not correct. Purify and Quantify instrument the DLL the first time it is executed after a compile/link cycle, then cache the result so that it can be used more quickly next time around. For large applications, profiling the DLLs can be very time consuming. It is also problematic - at a company I worked at between 1998 and 2000 we had a large 2 million line app that would take 4 hours to instrument, and 2 of the DLLs would randomly crash during instrumentation; if either failed you would have to delete both of them, then start over.
In-place instrumentation. This is similar to DLL rewriting, except that the DLL is not modified and the image on the disk remains untouched. The DLL functions are hooked appropriately to the task required when the DLL is first loaded (either during startup or after a call to LoadLibrary(Ex)). You can see techniques similar to this in the Microsoft Detours library.
On-the-fly instrumentation. Similar to in-place but only actually instruments a method the first time the method is executed. This is more complex than in-place and delays the instrumentation penalty until the first time the method is encountered. Depending on what you are doing, that could be a good thing or a bad thing.
Intermediate language instrumentation. This is what is often done with Java and .Net languages (C#, VB.Net, F#, etc). The language is compiled to an intermediate language which is then executed by a virtual machine. The virtual machine provides an interface (JVMTI for Java, ICorProfiler(2) for .Net) which allows you to monitor what the virtual machine is doing. Some of these options allow you to modify the intermediate language just before it gets compiled to executable instructions.
Intermediate language instrumentation via reflection. Java and .Net both provide reflection APIs that allow the discovery of metadata about methods. Using this data you can create new methods on the fly and instrument existing methods just as with the previously mentioned Intermediate language instrumentation.
Compile time instrumentation. This technique is used at compile time to insert appropriate instructions into the application during compilation. Not often used, a profiling feature of Visual Studio provides this feature. Requires a full rebuild and link.
Source code instrumentation. This technique is used to modify source code to insert appropriate code (usually conditionally compiled so you can turn it off).
Link time instrumentation. This technique is only really useful for replacing the default memory allocators with tracing allocators. An early example of this was the Sentinel memory leak detector on Solaris/HP in the early 1990s.
The various in-place and on-the-fly instrumentation methods are fraught with danger as it is very hard to stop all threads safely and modify the code without running the risk of requiring an API call that may want to access a lock which is held by a thread you've just paused - you don't want to do that, you'll get a deadlock. You also have to check if any of the other threads are executing that method, because if they are you can't modify it.
The virtual machine based instrumentation methods are much easier to use as the virtual machine guarantees that you can safely modify the code at that point.
(EDIT - this item added later) IAT hooking instrumentation. This involves modifying the import address table for functions linked against in other DLLs/Shared Libraries. This type of instrumentation is probably the simplest method to get working: you do not need to know how to disassemble and modify existing binaries, or do the same with virtual machine opcodes. You just patch the import table with your own function address and call the real function from your hook. Used in many commercial and open source tools.
I think I've covered them all, hope that helps.
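The IAT-hooking idea can be imitated in a few lines of a dynamic language, which may help as a mental model. The sketch below is Python, not Windows code, and is only an analogy: a module's attribute table stands in for the import address table, and install_hook is a hypothetical helper. It swaps the table entry for a wrapper that records the call and then forwards to the real function.

```python
import math

call_log = []


def install_hook(module, name):
    """Replace module.name with a recording wrapper; return an undo function."""
    real = getattr(module, name)

    def hook(*args, **kwargs):
        call_log.append((name, args))   # the instrumentation itself
        return real(*args, **kwargs)    # forward to the real function

    setattr(module, name, hook)         # patch the "import table" entry
    return lambda: setattr(module, name, real)


def hypotenuse(a, b):
    # Looks math.sqrt up through the module table on every call.
    return math.sqrt(a * a + b * b)


undo = install_hook(math, "sqrt")
result = hypotenuse(3.0, 4.0)       # recorded by the hook
undo()
after_undo = hypotenuse(6.0, 8.0)   # no longer recorded
```

As with a real import table, only callers that go through the table are affected; code that saved its own reference to the original function before the patch would bypass the hook.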
Instrumentation is usually used in dynamic code analysis.
It differs from logging in that instrumentation is usually done automatically by software, while logging needs human intelligence to insert the logging code.
It's a general term for doing something to your code necessary for some further analysis.
Especially for languages like C or C++, there are tools like Purify or Quantify that profile memory usage, performance statistics, and the like. To make those profiling programs work correctly, an "instrumenting" step is necessary to insert the counters, array-boundary checks, etc. that are used by the profiling programs. Note that in the Purify/Quantify scenario, the instrumentation is done automatically as a post-compilation step (actually, it's an added step to the linking process) and you don't touch your source code.
Some of that is less necessary with dynamic or VM code (i.e. profiling tools like OptimizeIt are available for Java that does a lot of what Quantify does, but no special linking is required) but that doesn't negate the concept.
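A minimal sketch of the "insert the counters automatically" idea, in Python for brevity (an assumption of this example; Purify and Quantify work on native binaries and are not shown here). The interpreter's sys.setprofile hook counts calls without any change to the measured function's source. The names count_calls and fib are hypothetical.

```python
import sys
from collections import Counter


def count_calls(func, *args):
    """Run func(*args) while counting every Python-level function call."""
    counts = Counter()

    def profiler(frame, event, arg):
        # "call" fires each time a Python function is entered.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return result, counts


def fib(n):
    # Deliberately naive recursion, so there is something to count.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, counts = count_calls(fib, 5)
```

Note that fib itself contains no counting code, which is the distinction drawn above between instrumentation and hand-written logging.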
An excerpt from the Wikipedia article:
In context of computer programming, instrumentation refers to an
ability to monitor or measure the level of a product's performance, to
diagnose errors and to write trace information. Programmers implement
instrumentation in the form of code instructions that monitor specific
components in a system (for example, instructions may output logging
information to appear on screen). When an application contains
instrumentation code, it can be managed using a management tool.
Instrumentation is necessary to review the performance of the
application. Instrumentation approaches can be of two types, source
instrumentation and binary instrumentation.
Whatever Wikipedia says, there is no standard or widely agreed definition of code instrumentation in the IT industry.
Please consider, instrumentation is a noun derived from instrument which has very broad meaning.
"Code" is also everything in IT, I mean - data, services, everything.
Hence, code instrumentation is a set of applications that is so wide ... not worth giving it a separate name ;-).
That's probably why this Wikipedia article is only a stub.