How do they write different language wrappers for same library? - language-agnostic

Generally a library will be released in a single language (for example C). If the library tuns out to be useful then many language wrappers for that library will be written. How exactly do they do it?
Kindly someone throw little light on this topic. If it is too language dependent pick language of your choice and explain it.

There are a few options that come to mind:
Port the original C library to the language/platform of your choice
Compile the C library into something (like a DLL) that can be invoked from other components
Put the library on the web, expose an API over HTTP and wrap that on the client
If I wanted to wrap a C library with a managed (.NET) layer, I'd compile the library into a DLL, exposing the APIs I wanted. Then, I'd use P/Invoke to call those APIs from my C# code.

Related

Extending embedded Python in C++ - Design to interact with C++ instances

There are several packages out there that help in automating the task of writing bindings between C\C++ and other languages.
In my case, I'd like to bind Python, some options for such packages are: SWIG, Boost.Python and Robin.
It seems that the straight forward process is to use these packages to create C\C++ linkable libraries (with mostly static functions) and have the higher language be extended using them.
However, my situation is that I already have a developed working system in C++ therefore plan to embed Python into it so that future development will be in Python.
It's not clear to me how, and if at all possible, to use these packages in helping to extend embedded Python in such a way that the Python code would be able to interact with the various Singleton instances already running in the system, and instantiate C++ classes and interact with them.
What I'm looking for is an insight regarding the design best fitted for this situation.
Boost.python lets you do a lot of those things right out of the box, especially if you use smart pointers. You can even inherit from C++ classes in Python, then pass instances of those back to your C++ code and have everything still work. My favorite resource on how to do various stuff is this (especially check out the "How To" section): http://wiki.python.org/moin/boost.python/ .
Boost.python is especially good if you're using smart pointers or intrusive pointers, as those translate transparently into PyObject reference counting. Also, it's very good at making factory functions look like Python constructors, which makes for very clean Python APIs.
If you're not using smart pointers, it's still possible to do all the things you want, but you have to mess with various return and lifetime policies, which can give you a headache.
To make it short: There is the modern alternative pybind11.
Long version: I also had to embed python. The C++ Python interface is small so I decided to use the C Api. That turned out to be a nightmare. Exposing classes lets you write tons of complicated boilerplate code. Boost::Python greatly avoids this by using readable interface definitions. However I found that boost lacks a sophisticated documentation and dor some things you still have to call the Python api. Further their build system seems to give people troubles. I cant tell since i use packages provided by the system. Finally I tried the boost python fork pybind11 and have to say that it is really convenient and fixes some shortcomings of boost like the necessity of the use of the Python Api, ability to use lambdas, the lack of an easy comprehensible documentation and automatic exception translation. Further it is header only and does not pull the huge boost dependency on deployment, so I can definitively recommend it.

Understand foreign function interface (FFI) and language binding

Mixing different programming languages has long been something I don't quite understand. According to this Wikipedia article, a foreign function interface (or FFI) can be done in several ways:
Requiring that guest-language functions which are to be host-language callable be specified or implemented in a particular way; often using a compatibility library of some sort.
Use of a tool to automatically "wrap" guest-language functions with appropriate glue code, which performs any necessary translation.
Use of wrapper libraries
Restricting the set of host language capabilities which can be used cross-language. For example, C++ functions called from C may not (in general) include reference parameters or throw exceptions.
My questions:
What are the differences between the
1st, 2nd and 3rd ways? It seems to
me they are all to compile the code of
the called language into some
library with object files and header
files, which are then called by the
calling language.
One source it links says,
implementing an FFI can be done in
several ways:
Requiring the called functions in the target language implement a
specific protocol.
Implementing a wrapper library that takes a given low-language
function, and "wraps" it with code to do data conversion to/from the
high-level language conventions.
Requiring functions declared native to use a subset of high-level functionality (which is compatible with the low-level language).
I was wondering if the first way in
the linked source is the same as the
first way in Wikipedia?
What does the third way in this
source mean? Does it corresponds to the 4th way in Wikipedia?
In the same source, when comparing the three ways it lists, it seems to say
the job of filling the gap between
the two languages is gradually
shifted from the called language
to the calling language. I was
wondering how to understand that? Is this shifting also true for the four ways in Wikipedia?
Are Language binding and FFI
equivalent concepts? How are they
related and differ?
a binding from a programming language
to a library or OS service is an API
providing that service in the
language.
I was wondering which way in the quotation from Wikipedia or from the source each of the following examples belongs to?
Common Object Request Broker Architecture (CORBA)
Calling C in C++, by the extern "C" declaration in C++ to
disable name mangling.
Calling C in Matlab, by MATLAB Interface to Shared Libraries, i.e., first compiling C code to shared library via general C
compiler such as gcc, and then
loading, calling a function from
and unloading the shared library
via Matlab functions
loadlibrary(), calllib() and
unloadlibrary().
Calling C in Matlab, by Creating C/C++ Language MEX-Files
Calling Matlab in C, by mcc compiler
Calling C++ in Java, by JNI, and Calling Java in C++, also by JNI
Calling C/C++ in other languages, Using SWIG
Calling C in Python, by Ctypes module.
Cython
Calling R in Python, by RPy
Programming Language Bindings to OpenGL from various languages, such as Python, Fortran and Java
Bindings for a C library, such as Cairo, from various languages,
such as C++, Python, Java, Common Lisp
May be a specific example will help. Let us take the host language as Python and the guest language as C. This means that Python will be calling C functions.
The first option is to write the C library in a particular way. In the case of Python the standard way would be to have the C function written with a first parameter of Py_Object * among other conditions. For example (from here):
static PyObject *
spam_system(PyObject *self, PyObject *args)
{
const char *command;
int sts;
if (!PyArg_ParseTuple(args, "s", &command))
return NULL;
sts = system(command);
return Py_BuildValue("i", sts);
}
is a C function callable from Python. For this to work the library has to be written with Python compatibility in mind.
If you want to use an already existing C library, you need another option. One is to have a tool that generates wraps this existing library in a format suitable for consumption by the host language. Take Swig which can be used to tie many languages. Given an existing C library you can use swig to effectively generate C code that calls your existing library while conforming to Python conventions. See the example for building a Python module.
Another option to us an already existing C library is to call it from a Python library that effectively wraps the calls at run time, like ctypes. While in option 2 compilation was necessary, it is not this time.
Another thing is that there are a lot of options (which do overlap) for calling functions in one language from another language. There are FFIs (equivalent to language bindings as far as I know) which usually refer to calling between multiple languages in the same process (as part of the same executable, so to speak), and there are interprocess communication means (local and network). Things like CORBA and Web Services (SOAP or REST) and and COM+ and remote procedure calls in general are of the second category and are not seen as FFI. In fact, they mostly don't prescribe any particular language to be used at either side of the communication. I would loosely put them as IPC (interprocess communication) options, though this is simplification in the case of network based APi like CORBA and SOAP.
Having a go at your list, I would venture the following opinions:
Common Object Request Broker Architecture: IPC, not FFI
Calling C in C++, by the extern "C" declaration in C++ to disable name mangling. ****
Calling C in Matlab, by MATLAB Interface to Shared Libraries Option 3 (ctypes-like)
Calling C in Matlab, by Creating C/C++ Language MEX-Files Option 2 (swig-like)
Calling Matlab in C, by mcc compiler Option 2 (swig-like)
Calling C++ in Java, by JNI, and Calling Java in C++ by JNI Option 3 (ctypes-like)
Calling C/C++ in other languages, Using SWIG Option 2 (swig)
Calling C in Python, by Ctypes Option 3 (ctypes)
Cython Option 2 (swig-like)
Calling R in Python, by RPy Option 3 (ctypes-like) in part, and partly about data exchange (not FFI)
The next two are not foreign function interfaces at all, as the term is used. FFi is about the interaction between two programming languages and should be capable of making any library (with suitable restrictions) from one language available to the other. A particular library being accessible from one language does not an FFI make.
Programming Language Bindings to OpenGL from various languages
Bindings for a C library from various languages

Loading Elixir/SQLAlchemy models in .NET?

A new requirement has come down from the top: implement 'proprietary business tech' with the awesome, resilient Elixir database I have set up. I've tried a lot of different things, such as creating an implib from the provided interop DLL (which apparently doesn't work like COM dlls) which didn't work at all. CPython doesn't like the MFC stuff either, so all attempts to create a Python lib have failed (using C anyway, not sure you can create a python library from .NET directly).
The only saving grace is the developer saw fit to provide VBA, .NET and MFC Interop C++ hooks into his library, so there are "some" choices, though they all ultimately lead back to the same framework. What would be the best method to:
A) Keep my model definitions in one place, in one language (Python/Elixir/SQLAlchemy)
B) Have this new .NET access the models without resorting to brittle, hard-coded SQL.
Any and all suggestions are welcome.
After a day or so of deliberation, I'm attempting to load the new business module in IronPython. Although I don't really want to introduce to python interpreters into my environment, I think that this will be the glue I need to get this done efficiently.

Framework vs. Toolkit vs. Library [duplicate]

This question already has answers here:
What is the difference between a framework and a library? [closed]
(22 answers)
Closed 6 years ago.
What is the difference between a Framework, a Toolkit and a Library?
The most important difference, and in fact the defining difference between a library and a framework is Inversion of Control.
What does this mean? Well, it means that when you call a library, you are in control. But with a framework, the control is inverted: the framework calls you. (This is called the Hollywood Principle: Don't call Us, We'll call You.) This is pretty much the definition of a framework. If it doesn't have Inversion of Control, it's not a framework. (I'm looking at you, .NET!)
Basically, all the control flow is already in the framework, and there's just a bunch of predefined white spots that you can fill out with your code.
A library on the other hand is a collection of functionality that you can call.
I don't know if the term toolkit is really well defined. Just the word "kit" seems to suggest some kind of modularity, i.e. a set of independent libraries that you can pick and choose from. What, then, makes a toolkit different from just a bunch of independent libraries? Integration: if you just have a bunch of independent libraries, there is no guarantee that they will work well together, whereas the libraries in a toolkit have been designed to work well together – you just don't have to use all of them.
But that's really just my interpretation of the term. Unlike library and framework, which are well-defined, I don't think that there is a widely accepted definition of toolkit.
Martin Fowler discusses the difference between a library and a framework in his article on Inversion of Control:
Inversion of Control is a key part of
what makes a framework different to a
library. A library is essentially a
set of functions that you can call,
these days usually organized into
classes. Each call does some work and
returns control to the client.
A framework embodies some abstract
design, with more behavior built in.
In order to use it you need to insert
your behavior into various places in
the framework either by subclassing or
by plugging in your own classes. The
framework's code then calls your code
at these points.
To summarize: your code calls a library but a framework calls your code.
Diagram
If you are a more visual learner, here is a diagram that makes it clearer:
(Credits: http://tom.lokhorst.eu/2010/09/why-libraries-are-better-than-frameworks)
The answer provided by Barrass is probably the most complete. However, the explanation could easily be stated more clearly. Most people miss the fact that these are all nested concepts. So let me lay it out for you.
When writing code:
eventually you discover sections of code that you're repeating in your program, so you refactor those into Functions/Methods.
eventually, after having written a few programs, you find yourself copying functions you already made into new programs. To save yourself time you bundle those functions into Libraries.
eventually you find yourself creating the same kind of user interfaces every time you make use of certain libraries. So you refactor your work and create a Toolkit that allows you to create your UIs more easily from generic method calls.
eventually, you've written so many apps that use the same toolkits and libraries that you create a Framework that has a generic version of this boilerplate code already provided so all you need to do is design the look of the UI and handle the events that result from user interaction.
Generally speaking, this completely explains the differences between the terms.
Introduction
There are various terms relating to collections of related code, which have both historical (pre-1994/5 for the purposes of this answer) and current implications, and the reader should be aware of both, particularly when reading classic texts on computing/programming from the historic era.
Library
Both historically, and currently, a library is a collection of code relating to a specific task, or set of closely related tasks which operate at roughly the same level of abstraction. It generally lacks any purpose or intent of its own, and is intended to be used by (consumed) and integrated with client code to assist client code in executing its tasks.
Toolkit
Historically, a toolkit is a more focused library, with a defined and specific purpose. Currently, this term has fallen out of favour, and is used almost exclusively (to this author's knowledge) for graphical widgets, and GUI components in the current era. A toolkit will most often operate at a higher layer of abstraction than a library, and will often consume and use libraries itself. Unlike libraries, toolkit code will often be used to execute the task of the client code, such as building a window, resizing a window, etc. The lower levels of abstraction within a toolkit are either fixed, or can themselves be operated on by client code in a proscribed manner. (Think Window style, which can either be fixed, or which could be altered in advance by client code.)
Framework
Historically, a framework was a suite of inter-related libraries and modules which were separated into either 'General' or 'Specific' categories. General frameworks were intended to offer a comprehensive and integrated platform for building applications by offering general functionality, such as cross platform memory management, multi-threading abstractions, dynamic structures (and generic structures in general). Historical general frameworks (Without dependency injection, see below) have almost universally been superseded by polymorphic templated (parameterised) packaged language offerings in OO languages, such as the STL for C++, or in packaged libraries for non-OO languages (guaranteed Solaris C headers). General frameworks operated at differing layers of abstraction, but universally low level, and like libraries relied on the client code carrying out it's specific tasks with their assistance.
'Specific' frameworks were historically developed for single (but often sprawling) tasks, such as "Command and Control" systems for industrial systems, and early networking stacks, and operated at a high level of abstraction and like toolkits were used to carry out execution of the client codes tasks.
Currently, the definition of a framework has become more focused and taken on the "Inversion of Control" principle as mentioned elsewhere as a guiding principle, so program flow, as well as execution is carried out by the framework. Frameworks are still however targeted either towards a specific output; an application for a specific OS for example (MFC for MS Windows for example), or for more general purpose work (Spring framework for example).
SDK: "Software Development Kit"
An SDK is a collection of tools to assist the programmer to create and deploy code/content which is very specifically targeted to either run on a very particular platform or in a very particular manner. An SDK can consist of simply a set of libraries which must be used in a specific way only by the client code and which can be compiled as normal, up to a set of binary tools which create or adapt binary assets to produce its (the SDK's) output.
Engine
An Engine (In code collection terms) is a binary which will run bespoke content or process input data in some way. Game and Graphics engines are perhaps the most prevalent users of this term, and are almost universally used with an SDK to target the engine itself, such as the UDK (Unreal Development Kit) but other engines also exist, such as Search engines and RDBMS engines.
An engine will often, but not always, allow only a few of its internals to be accessible to its clients. Most often to either target a different architecture, change the presentation of the output of the engine, or for tuning purposes. Open Source Engines are by definition open to clients to change and alter as required, and some propriety engines are fixed completely. The most often used engines in the world however, are almost certainly JavaScript Engines. Embedded into every browser everywhere, there are a whole host of JavaScript engines which will take JavaScript as an input, process it, and then output to render.
API: "Application Programming Interface"
The final term I am answering is a personal bugbear of mine: API, was historically used to describe the external interface of an application or environment which, itself was capable of running independently, or at least of carrying out its tasks without any necessary client intervention after initial execution. Applications such as Databases, Word Processors and Windows systems would expose a fixed set of internal hooks or objects to the external interface which a client could then call/modify/use, etc to carry out capabilities which the original application could carry out. API's varied between how much functionality was available through the API, and also, how much of the core application was (re)used by the client code. (For example, a word processing API may require the full application to be background loaded when each instance of the client code runs, or perhaps just one of its linked libraries; whereas a running windowing system would create internal objects to be managed by itself and pass back handles to the client code to be utilised instead.
Currently, the term API has a much broader range, and is often used to describe almost every other term within this answer. Indeed, the most common definition applied to this term is that an API offers up a contracted external interface to another piece of software (Client code to the API). In practice this means that an API is language dependent, and has a concrete implementation which is provided by one of the above code collections, such as a library, toolkit, or framework.
To look at a specific area, protocols, for example, an API is different to a protocol which is a more generic term representing a set of rules, however an individual implementation of a specific protocol/protocol suite that exposes an external interface to other software would most often be called an API.
Remark
As noted above, historic and current definitions of the above terms have shifted, and this can be seen to be down to advances in scientific understanding of the underlying computing principles and paradigms, and also down to the emergence of particular patterns of software. In particular, the GUI and Windowing systems of the early nineties helped to define many of these terms, but since the effective hybridisation of OS Kernel and Windowing system for mass consumer operating systems (bar perhaps Linux), and the mass adoption of dependency injection/inversion of control as a mechanism to consume libraries and frameworks, these terms have had to change their respective meanings.
P.S. (A year later)
After thinking carefully about this subject for over a year I reject the IoC principle as the defining difference between a framework and a library. There ARE a large number of popular authors who say that it is, but there are an almost equal number of people who say that it isn't. There are simply too many 'Frameworks' out there which DO NOT use IoC to say that it is the defining principle. A search for embedded or micro controller frameworks reveals a whole plethora which do NOT use IoC and I now believe that the .NET language and CLR is an acceptable descendant of the "general" framework. To say that IoC is the defining characteristic is simply too rigid for me to accept I'm afraid, and rejects out of hand anything putting itself forward as a framework which matches the historical representation as mentioned above.
For details of non-IoC frameworks, see, as mentioned above, many embedded and micro frameworks, as well as any historical framework in a language that does not provide callback through the language (OK. Callbacks can be hacked for any device with a modern register system, but not by the average programmer), and obviously, the .NET framework.
A library is simply a collection of methods/functions wrapped up into a package that can be imported into a code project and re-used.
A framework is a robust library or collection of libraries that provides a "foundation" for your code. A framework follows the Inversion of Control pattern. For example, the .NET framework is a large collection of cohesive libraries in which you build your application on top of. You can argue there isn't a big difference between a framework and a library, but when people say "framework" it typically implies a larger, more robust suite of libraries which will play an integral part of an application.
I think of a toolkit the same way I think of an SDK. It comes with documentation, examples, libraries, wrappers, etc. Again, you can say this is the same as a framework and you would probably be right to do so.
They can almost all be used interchangeably.
very, very similar, a framework is usually a bit more developed and complete then a library, and a toolkit can simply be a collection of similar librarys and frameworks.
a really good question that is maybe even the slightest bit subjective in nature, but I believe that is about the best answer I could give.
Library
I think it's unanimous that a library is code already coded that you can use so as not to have to code it again. The code must be organized in a way that allows you to look up the functionality you want and use it from your own code.
Most programming languages come with standard libraries, especially some code that implements some kind of collection. This is always for the convenience that you don't have to code these things yourself. Similarly, most programming languages have construct to allow you to look up functionality from libraries, with things like dynamic linking, namespaces, etc.
So code that finds itself often needed to be re-used is great code to be put inside a library.
Toolkit
A set of tools used for a particular purpose. This is unanimous. The question is, what is considered a tool and what isn't. I'd say there's no fixed definition, it depends on the context of the thing calling itself a toolkit. Example of tools could be libraries, widgets, scripts, programs, editors, documentation, servers, debuggers, etc.
Another thing to note is the "particular purpose". This is always true, but the scope of the purpose can easily change based on who made the toolkit. So it can easily be a programmer's toolkit, or it can be a string parsing toolkit. One is so broad, it could have tool touching everything programming related, while the other is more precise.
SDKs are generally toolkits, in that they try and bundle a set of tools (often of multiple kind) into a single package.
I think the common thread is that a tool does something for you, either completely, or it helps you do it. And a toolkit is simply a set of tools which all perform or help you perform a particular set of activities.
Framework
Frameworks aren't quite as unanimously defined. It seems to be a bit of a blanket term for anything that can frame your code. Which would mean: any structure that underlies or supports your code.
This implies that you build your code against a framework, whereas you build a library against your code.
But, it seems that sometimes the word framework is used in the same sense as toolkit or even library. The .Net Framework is mostly a toolkit, because it's composed of the FCL which is a library, and the CLR, which is a virtual machine. So you would consider it a toolkit to C# development on Windows. Mono being a toolkit for C# development on Linux. Yet they called it a framework. It makes sense to think of it this way too, since it kinds of frame your code, but a frame should more support and hold things together, then do any kind of work, so my opinion is this is not the way you should use the word.
And I think the industry is trying to move into having framework mean an already written program with missing pieces that you must provide or customize. Which I think is a good thing, since toolkit and library are great precise terms for other usages of "framework".
Framework: installed on you machine and allowing you to interact with it. without the framework you can't send programming commands to your machine
Library: aims to solve a certain problem (or several problems related to the same category)
Toolkit: a collection of many pieces of code that can solve multiple problems on multiple issues (just like a toolbox)
It's a little bit subjective I think. The toolkit is the easiest. It's just a bunch of methods, classes that can be use.
The library vs the framework question I make difference by the way to use them. I read somewhere the perfect answer a long time ago. The framework calls your code, but on the other hand your code calls the library.
In relation with the correct answer from Mittag:
a simple example. Let's say you implement the ISerializable interface (.Net) in one of your classes. You make use of the framework qualities of .Net then, rather than it's library qualities. You fill in the "white spots" (as mittag said) and you have the skeleton completed. You must know in advance how the framework is going to "react" with your code. Actually .net IS a framework, and here is where i disagree with the view of Mittag.
The full, complete answer to your question is given very lucidly in Chapter 19 (the whole chapter devoted to just this theme) of this book, which is a very good book by the way (not at all "just for Smalltalk").
Others have noted that .net may be both a framework and a library and a toolkit depending on which part you use but perhaps an example helps. Entity Framework for dealing with databases is a part of .net that does use the inversion of control pattern. You let it know your models it figures out what to do with them. As a programmer it requires you to understand "the mind of the framework", or more realistically the mind of the designer and what they are going to do with your inputs. datareader and related calls, on the other hand, are simply a tool to go get or put data to and from table/view and make it available to you. It would never understand how to take a parent child relationship and translate it from object to relational, you'd use multiple tools to do that. But you would have much more control on how that data was stored, when, transactions, etc.

Understanding run time code interpretation and execution

I'm creating a game in XNA and was thinking of creating my own scripting language (extremely simple mind you). I know there's better ways to go about this (and that I'm reinventing the wheel), but I want the learning experience more than to be productive and fast.
When confronted with code at run time, from what I understand, the usual approach is to parse into a machine code or byte code or something else that is actually executable and then execute that, right? But, for instance, when Chrome first came out they said their JavaScript engine was fast because it compiles the JavaScript into machine code. This implies other engines weren't compiling into machine code.
I'd prefer not compiling to a lower language, so are there any known modern techniques for parsing and executing code without compiling to low level? Perhaps something like parsing the code into some sort of tree, branching through the tree, and comparing each symbol and calling some function that handles that symbol? (Wild guessing and stabbing in the dark)
I personally wouldn't roll your own parser ( turning the input into tokens ) or lexer ( checking the input tokens for your language grammar ). Take a look at ANTLR for parsing/lexing - it's a great framework and has full source code if you want to dig into the guts of it.
For executing code that you've parsed, I'd look at running a simple virtual machine or even better look at llvm which is an open-source(ish) attempt to standardise the virtual machine byte code format and provide nice features like JITing ( turning your script compiled byte code into assembly ).
I wouldn't discourage you from the more advanced options that you machine such as native machine code execution but bear in mind that this is a very specialist area and gets real complex, real fast!
Earlz pointed out that my reply might seem to imply 'don't bother doing this yourself. Re-reading my post it does sound a bit that way. The reason I mentioned ANTLR and LLVM is they both have heaps of source code and tutorials so I feel this is a good reference source. Take it as a base and play
You can try this framework for building languages (it works well with XNA):
http://www.meta-alternative.net/mbase.html
There are some tutorials:
http://www.meta-alternative.net/calc.pdf
http://www.meta-alternative.net/pfront.pdf
Python is great as a scripting language. I would recommend you make a C# binding for its C API and use that. Embedding Python is easy. Your application can define functions, types/classes and variables inside modules which the Python interpreter can access. The application can also call functions in Python scripts and get a result back. These two features combined gives you a two-way communication scheme.
Basically, you get the Python syntax and semantics for free. What you would need to implement is the API your application exposes to Python. An example could be access to game logic functions and render functions. Python scripts would then define functions which calls these, and the host application would invoke the Python functions (with parameters) to get work done.
EDIT: Seems like IronPython can save you even more work. It's a C# implementation of CPython, and has its own embedding API: http://www.ironpython.net/