What is a Shim? - terminology

What's the definition of a Shim?

Simple Explanation via Cartoon
Summary
A shim is some code that takes care of what's asked (by 'interception'), without anyone being any wiser about it.
Example of a Shim
An example of a shim is rbenv (a Ruby tool). Calls to Ruby commands are "shimmed": when you run bundle install, rbenv intercepts the call and reroutes it to the specific version of Ruby you are running. If that doesn't make sense, try this example, or just think of the fairy godmother intercepting messages and delivering apposite outcomes.
That's it!
Important Clarifications on this example
Note: Like most analogies, this is not perfect: usually Ralph will get EXACTLY what he asked for, but the mechanics of HOW it was obtained are something Ralph doesn't care about. If Ralph asks for dog food, a good shim will deliver dog food.
I wanted to avoid semantic arguments and complexity (e.g. the Gang of Four adapter, facade and proxy design patterns), which are not that great when you're trying to explain a concept. Introducing code? Pedagogically risky. A Wikipedia-like explanation? Boooring, too complex, and time consuming. So I deliberately simplified to a cartoon that you can understand in a "fun" way in 30 seconds and that is memorable, so you can move on. This approach is not for everyone: if you want a precise definition, consider the Wikipedia entry on shims.

The term "shim" as defined in Wikipedia would technically be classified, based on its definition, as a "Structural" design pattern. The many types of "Structural" design patterns are quite clearly described in the (some would say de facto) object-oriented design patterns reference "Design Patterns: Elements of Reusable Object-Oriented Software", better known as the "Gang of Four" book.
The "Gang of Four" text outlines at least 3 well-established patterns, "Proxy", "Adapter" and "Facade", which all provide "shim"-type functionality. In most fields it is often the use and/or misuse of different terms for the same root concept that causes confusion. Using the word "shim" to describe the more specific "Structural" design patterns "Proxy", "Adapter" and "Facade" is a clear example of this. A "shim" is simply a more general term for the more specific "Structural" patterns "Proxy", "Adapter", "Facade" and possibly others.

According to Microsoft's article "Demystifying Shims":
It’s a metaphor based on the English language word shim, which is an
engineering term used to describe a piece of wood or metal that is
inserted between two objects to make them fit together better. In
computer programming, a shim is a small library which transparently
intercepts an API, changes the parameters passed, handles the
operation itself, or redirects the operation elsewhere. Shims can also
be used for running programs on different software platforms than they
were developed for.
So a shim is a generic term for any library of code that acts as a middleman and partially or completely changes the behavior or operation of a program. Like a true middleman, it can affect the data passed to that program, or affect the data returned from that program.
The Windows API is an example:
The application is generally unaware that the request is going to a
shim DLL instead of to Windows itself, and Windows is unaware that the
request is coming from a source other than the application (because
the shim DLL is just another DLL inside the application’s process).
So the two programs that make the "bread" of the "shim sandwich" should not be able to differentiate between talking to their counterpart program and talking to the shim.
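As a rough illustration of the "shim sandwich" described above, here is a minimal Python sketch of a shim sitting between a caller and an API that cannot be changed. The names legacy_area and area_shim are made up for this example and do not come from the article; the unit conversion is just a stand-in for "changes the parameters passed" and "affects the data returned".

```python
import functools


def legacy_area(width_cm, height_cm):
    """Stand-in for an API we cannot modify; it expects centimetres."""
    return width_cm * height_cm


def area_shim(target):
    """Wrap `target` so callers can keep passing metres (hypothetical)."""
    @functools.wraps(target)
    def shim(width_m, height_m):
        # Change the parameters on the way in ...
        result_cm2 = target(width_m * 100, height_m * 100)
        # ... and the data on the way out; neither side is aware of it.
        return result_cm2 / 10_000
    return shim


# The caller just sees an ordinary function that works in metres.
area = area_shim(legacy_area)
print(area(2, 3))  # 6.0
```

Neither legacy_area nor the calling code had to change, which is the property the quoted article attributes to shims.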
What are some pros and cons of using shims?
Again, from the article:
You can fix applications without access to the source code, or without
changing them at all. You incur a minimal amount of additional
management overhead... and you can fix a
reasonable number of applications this way. The downside is support as
most vendors don’t support shimmed applications. You can’t fix every
application using shims. Most people typically consider shims for
applications where the vendor is out of business, the software isn’t
strategic enough to necessitate support, or they just want to buy some
time.

As for origins of the word, quoth Apple's Dictionary widget
noun
a washer or thin strip of material used to align parts,
make them fit, or reduce wear.
verb ( shimmed, shimming) [ trans. ]
wedge (something) or fill up (a space) with a shim.
ORIGIN early 18th cent.: of unknown origin
This seems to fit quite well with how web designers use the term.

Shims are used in the .NET 4.5 Microsoft Fakes framework to isolate your application from other assemblies for unit testing. Shims divert calls to specific methods to code that you write as part of your test.
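Microsoft Fakes is a .NET tool, so the following is only a Python analogue of the same idea, not the Fakes API: the standard library's unittest.mock.patch diverts calls to time.time to a value chosen by the test. The function seconds_since is a hypothetical piece of code under test.

```python
import time
from unittest import mock


def seconds_since(start):
    """Code under test: depends on the real system clock."""
    return time.time() - start


# Divert time.time() to a fixed value for the duration of the block.
with mock.patch("time.time", return_value=1_000.0):
    elapsed = seconds_since(990.0)

print(elapsed)  # 10.0
```

Once the with block exits, time.time is restored, so the diversion stays local to the test.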

As we could see in many responses here, a shim is a sort of adapter that provides functionality at API level which was not necessarily part of that API. This thread has a lot of good and complete responses, so I'm not expanding the definition further.
However, I think I can add a good example, which is the JavaScript ES5 Shim (https://github.com/es-shims/es5-shim):
JavaScript has evolved a lot during the last few years, and among many other changes to the language specification, a lot of new methods have been added to its core objects.
For example, in the ES2015 specification (aka ES6, the successor to ES5), the method find was added to the Array prototype. So let's say you are running your code using a JavaScript engine prior to this specification (ex: Node 0.12) which doesn't offer that method yet. By loading the matching shim (es6-shim for find; es5-shim covers the older ES5 additions such as Array.prototype.forEach), these new methods will be added to the Array prototype, allowing you to make use of them even if you are not running on a newer JavaScript specification.
You might ask: why would someone do that instead of upgrading the environment to a newer version (let's say Node 8)?
There are many real-world scenarios where that approach makes sense. One good example:
Let's say you have a legacy system that is running in an old environment, and you need to use such new methods to implement or fix some functionality. The upgrade of your environment is still a work in progress because there are compatibility issues that require a lot of code changes and tests (a critical component).
In this example, you could try to craft your own version of such functionality, but that would make your code harder to read and more complex, could introduce new bugs, and would require tons of additional tests just to cover functionality that you know will be available in the next release.
Instead, you can use this shim and make use of these new methods, taking advantage of the fact that this fix/functionality will be compatible after the upgrade, because you are already using the methods known to be available in the next specification. And there is a bonus reason: since these methods are native to the next language specification, there is a good chance that they will run faster than any implementation that you could have done if you tried to make your own version.
Another real scenario where such an approach is welcome is at the browser level. Let's say you need to support old browsers and want to take advantage of these newer features. JavaScript is a language that allows you to add or modify methods in its core objects (like adding methods to the Array prototype), and those shim libraries are smart enough to add such methods only if the current implementation lacks them.
PS:
1) You will see the term "Polyfill" related to these JavaScript shims. A polyfill is a more specialized type of shim that is used to provide forward compatibility with different browser-level specifications. By the way, my example above is exactly such a case.
2) Shims are not limited to this example (adding functionality that will be available in a future release). There are different use cases that would be considered to be a shim as well.
3) If you are curious about how this specific polyfill is implemented, you can open Javascript Array.find specs and scroll to the end of the page where you will find a canonical implementation for this method.
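The "add the method only if it is missing" behaviour can be sketched in a few lines. This is a Python analogue rather than the actual JavaScript shim: LegacyList is a hypothetical stand-in for an old runtime's array type, and install_find_shim is an invented helper name.

```python
class LegacyList(list):
    """Stand-in for an old runtime's array type that lacks `find`."""


def install_find_shim(cls):
    """Add `find` to cls only if it is absent; return True if installed."""
    if hasattr(cls, "find"):
        return False  # a native implementation exists, leave it alone

    def find(self, predicate):
        # Return the first item matching predicate, or None if none does.
        for item in self:
            if predicate(item):
                return item
        return None

    cls.find = find
    return True


installed = install_find_shim(LegacyList)
first_big = LegacyList([1, 5, 8]).find(lambda x: x > 4)
print(installed, first_big)  # True 5
```

Calling install_find_shim a second time does nothing, which mirrors how a polyfill steps aside once the environment provides the method natively.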

SHIM is another level of security check, applied to all services, to protect upstream systems. The SHIM server validates every incoming request by checking the user credentials in the headers against the user credentials passed in the request (SOAP or RESTful).

Related

When, why and how to use wrappers?

I'm talking about wrappers for third-party libraries. Until recently I was trying to provide a general enough wrapper so I could easily switch libraries if needed. This however proved to be nearly impossible since libraries can vary greatly even in terms of how basic concepts are handled.
So the question came to me why one should use wrappers at all. (In the past I have been encouraged by experienced coders to write wrappers for 3rd-party libs.) I came to the following conclusions; please tell me if they are wrong or if you have anything to add.
If the library isn't widely used in the application (e.g. used by only one or two classes), don't write a wrapper at all, just use it directly. (Especially if it's a portable lib.)
When you do write wrappers, don't think you can make a one-size-fits-all wrapper. Write something appropriate for the strengths of the lib.
... But in some cases you can still generalize the wrapper enough so that it'll be somewhat easier to switch libraries. (E.g.: most graphics libraries use images and fonts.)
Wrappers are useful for when the library offers more functionality than you need. You can hide the unneeded functionality in the wrapper.
In the case of C libs (if you're using C++), you can also write a wrapper to help you with automatic memory management.
What do you think are the (dis)advantages of using wrappers, and how should they be used properly?
I think you've hit the nail on the head: wrappers written just to allow something to potentially be swapped out are a bad idea. The classic example is a database: who has actually ever had to switch from SQL Server to Oracle? (I know people have, but how often, and did having a wrapper really help?)
In my experience a wrapper only helps if it hides two or more calls to the third-party component or API behind a single call that means something to the calling code (basically a facade pattern), or if it wraps the code and adds value or type conversion for the caller (an adapter pattern).
So the wrapper must provide a benefit here and now to the consumer, not a potential future benefit (to the system coder) that may never be needed.
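A minimal sketch of the facade case described above, in Python. ThirdPartyMailer is a hypothetical stand-in for a third-party client that needs several calls per message; Alerts and the address oncall@example.com are likewise invented for illustration.

```python
class ThirdPartyMailer:
    """Hypothetical third-party client that needs several calls per message."""

    def __init__(self):
        self.log = []

    def connect(self, host):
        self.log.append(f"connect {host}")

    def authenticate(self, user):
        self.log.append(f"auth {user}")

    def send(self, to, body):
        self.log.append(f"send {to}: {body}")

    def disconnect(self):
        self.log.append("disconnect")


class Alerts:
    """Facade: one call that means something to the calling code."""

    def __init__(self, mailer, host, user):
        self._mailer, self._host, self._user = mailer, host, user

    def notify_on_call(self, message):
        self._mailer.connect(self._host)
        try:
            self._mailer.authenticate(self._user)
            self._mailer.send("oncall@example.com", message)
        finally:
            # Always release the connection, even if sending fails.
            self._mailer.disconnect()


mailer = ThirdPartyMailer()
Alerts(mailer, "smtp.example.com", "svc").notify_on_call("disk full")
print(mailer.log)
```

The benefit is immediate: callers write one line instead of four, regardless of whether the mail library is ever swapped out.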
Wrappers are powerful if you want to test in isolation. For example, my development system has no connection to my customer's Active Directory, which holds usernames and roles. So I have a UserInfoWrapper interface with two implementations: one that uses Active Directory and one with fake user data used for development.
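The two-implementation setup might look roughly like this in Python (3.9 or later). Only the fake implementation is shown; the Active Directory one is omitted because its details are not given here. The names UserInfo, FakeUserInfo and can_publish are assumptions made for the sketch.

```python
from typing import Protocol


class UserInfo(Protocol):
    """Interface the application depends on."""

    def roles_for(self, username: str) -> list[str]: ...


class FakeUserInfo:
    """Development implementation backed by an in-memory dict."""

    def __init__(self, roles: dict[str, list[str]]):
        self._roles = roles

    def roles_for(self, username: str) -> list[str]:
        return self._roles.get(username, [])


def can_publish(users: UserInfo, username: str) -> bool:
    """Application code sees only the UserInfo interface."""
    return "editor" in users.roles_for(username)


fake = FakeUserInfo({"ann": ["editor"], "bob": ["reader"]})
print(can_publish(fake, "ann"), can_publish(fake, "bob"))  # True False
```

A production implementation would satisfy the same interface, so can_publish never needs to know which one it was given.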
"All problems in computer science can be solved by another level of indirection" by Butler Lampson
There is a cost involved in abstracting third-party libraries by creating a wrapper. You need to decide whether the cost is worth it or not. For example, it is extremely difficult (or at least involves significant development cost) to create a wrapper around UI toolkits or libraries. By contrast, it is relatively easy to create wrappers for third-party logging libraries.
A wrapper can also be used to provide a simplified, domain-specific API on top of a third-party library. The facade pattern could be of help (as Paul Hadfield has mentioned above).

Framework vs. Toolkit vs. Library [duplicate]

What is the difference between a Framework, a Toolkit and a Library?
The most important difference, and in fact the defining difference between a library and a framework is Inversion of Control.
What does this mean? Well, it means that when you call a library, you are in control. But with a framework, the control is inverted: the framework calls you. (This is called the Hollywood Principle: Don't call Us, We'll call You.) This is pretty much the definition of a framework. If it doesn't have Inversion of Control, it's not a framework. (I'm looking at you, .NET!)
Basically, all the control flow is already in the framework, and there's just a bunch of predefined white spots that you can fill out with your code.
A library on the other hand is a collection of functionality that you can call.
I don't know if the term toolkit is really well defined. Just the word "kit" seems to suggest some kind of modularity, i.e. a set of independent libraries that you can pick and choose from. What, then, makes a toolkit different from just a bunch of independent libraries? Integration: if you just have a bunch of independent libraries, there is no guarantee that they will work well together, whereas the libraries in a toolkit have been designed to work well together – you just don't have to use all of them.
But that's really just my interpretation of the term. Unlike library and framework, which are well-defined, I don't think that there is a widely accepted definition of toolkit.
Martin Fowler discusses the difference between a library and a framework in his article on Inversion of Control:
Inversion of Control is a key part of
what makes a framework different to a
library. A library is essentially a
set of functions that you can call,
these days usually organized into
classes. Each call does some work and
returns control to the client.
A framework embodies some abstract
design, with more behavior built in.
In order to use it you need to insert
your behavior into various places in
the framework either by subclassing or
by plugging in your own classes. The
framework's code then calls your code
at these points.
To summarize: your code calls a library but a framework calls your code.
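The contrast can be made concrete with a short Python sketch. math.hypot is a real standard-library function; MiniFramework is a toy class invented here purely to show control flowing the other way.

```python
import math

# Library style: your code is in control and calls in when it wants.
hypotenuse = math.hypot(3, 4)


# Framework style: a tiny hypothetical framework owns the control flow.
class MiniFramework:
    def __init__(self):
        self._handlers = {}

    def on(self, event):
        """Decorator that registers your function for an event name."""
        def register(func):
            self._handlers[event] = func
            return func
        return register

    def run(self, events):
        # The framework decides when (and whether) your code is called.
        return [self._handlers[e]() for e in events if e in self._handlers]


app = MiniFramework()


@app.on("start")
def greet():
    return "started"


print(hypotenuse)                  # 5.0
print(app.run(["start", "stop"]))  # ['started']
```

Note that greet is never called directly by the application code; it only fills in one of the "white spots" and waits to be called.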
Diagram
If you are a more visual learner, here is a diagram that makes it clearer:
(Credits: http://tom.lokhorst.eu/2010/09/why-libraries-are-better-than-frameworks)
The answer provided by Barrass is probably the most complete. However, the explanation could easily be stated more clearly. Most people miss the fact that these are all nested concepts. So let me lay it out for you.
When writing code:
eventually you discover sections of code that you're repeating in your program, so you refactor those into Functions/Methods.
eventually, after having written a few programs, you find yourself copying functions you already made into new programs. To save yourself time you bundle those functions into Libraries.
eventually you find yourself creating the same kind of user interfaces every time you make use of certain libraries. So you refactor your work and create a Toolkit that allows you to create your UIs more easily from generic method calls.
eventually, you've written so many apps that use the same toolkits and libraries that you create a Framework that has a generic version of this boilerplate code already provided so all you need to do is design the look of the UI and handle the events that result from user interaction.
Generally speaking, this completely explains the differences between the terms.
Introduction
There are various terms relating to collections of related code, which have both historical (pre-1994/5 for the purposes of this answer) and current implications, and the reader should be aware of both, particularly when reading classic texts on computing/programming from the historic era.
Library
Both historically, and currently, a library is a collection of code relating to a specific task, or set of closely related tasks which operate at roughly the same level of abstraction. It generally lacks any purpose or intent of its own, and is intended to be used by (consumed) and integrated with client code to assist client code in executing its tasks.
Toolkit
Historically, a toolkit is a more focused library, with a defined and specific purpose. Currently, this term has fallen out of favour, and is used almost exclusively (to this author's knowledge) for graphical widgets and GUI components. A toolkit will most often operate at a higher layer of abstraction than a library, and will often consume and use libraries itself. Unlike libraries, toolkit code will often be used to execute the task of the client code, such as building a window, resizing a window, etc. The lower levels of abstraction within a toolkit are either fixed, or can themselves be operated on by client code in a prescribed manner. (Think Window style, which can either be fixed, or which could be altered in advance by client code.)
Framework
Historically, a framework was a suite of inter-related libraries and modules which were separated into either 'General' or 'Specific' categories. General frameworks were intended to offer a comprehensive and integrated platform for building applications by offering general functionality, such as cross-platform memory management, multi-threading abstractions, dynamic structures (and generic structures in general). Historical general frameworks (without dependency injection, see below) have almost universally been superseded by polymorphic templated (parameterised) packaged language offerings in OO languages, such as the STL for C++, or in packaged libraries for non-OO languages (guaranteed Solaris C headers). General frameworks operated at differing layers of abstraction, but universally low level, and like libraries relied on the client code carrying out its specific tasks with their assistance.
'Specific' frameworks were historically developed for single (but often sprawling) tasks, such as "Command and Control" systems for industrial systems, and early networking stacks. They operated at a high level of abstraction and, like toolkits, were used to carry out execution of the client code's tasks.
Currently, the definition of a framework has become more focused and taken on the "Inversion of Control" principle as mentioned elsewhere as a guiding principle, so program flow, as well as execution is carried out by the framework. Frameworks are still however targeted either towards a specific output; an application for a specific OS for example (MFC for MS Windows for example), or for more general purpose work (Spring framework for example).
SDK: "Software Development Kit"
An SDK is a collection of tools to assist the programmer to create and deploy code/content which is very specifically targeted to either run on a very particular platform or in a very particular manner. An SDK can consist of simply a set of libraries which must be used in a specific way only by the client code and which can be compiled as normal, up to a set of binary tools which create or adapt binary assets to produce its (the SDK's) output.
Engine
An Engine (In code collection terms) is a binary which will run bespoke content or process input data in some way. Game and Graphics engines are perhaps the most prevalent users of this term, and are almost universally used with an SDK to target the engine itself, such as the UDK (Unreal Development Kit) but other engines also exist, such as Search engines and RDBMS engines.
An engine will often, but not always, allow only a few of its internals to be accessible to its clients, most often to target a different architecture, change the presentation of the output of the engine, or for tuning purposes. Open Source Engines are by definition open to clients to change and alter as required, and some proprietary engines are fixed completely. The most often used engines in the world, however, are almost certainly JavaScript Engines. Embedded into every browser everywhere, there are a whole host of JavaScript engines which will take JavaScript as an input, process it, and then output to render.
API: "Application Programming Interface"
The final term I am answering is a personal bugbear of mine: API was historically used to describe the external interface of an application or environment which itself was capable of running independently, or at least of carrying out its tasks without any necessary client intervention after initial execution. Applications such as databases, word processors and windowing systems would expose a fixed set of internal hooks or objects to the external interface which a client could then call/modify/use, etc. to carry out capabilities which the original application could carry out. APIs varied in how much functionality was available through the API, and also in how much of the core application was (re)used by the client code. (For example, a word processing API may require the full application to be background loaded when each instance of the client code runs, or perhaps just one of its linked libraries; whereas a running windowing system would create internal objects to be managed by itself and pass back handles to the client code to be utilised instead.)
Currently, the term API has a much broader range, and is often used to describe almost every other term within this answer. Indeed, the most common definition applied to this term is that an API offers up a contracted external interface to another piece of software (Client code to the API). In practice this means that an API is language dependent, and has a concrete implementation which is provided by one of the above code collections, such as a library, toolkit, or framework.
To look at a specific area, protocols, for example, an API is different to a protocol which is a more generic term representing a set of rules, however an individual implementation of a specific protocol/protocol suite that exposes an external interface to other software would most often be called an API.
Remark
As noted above, historic and current definitions of the above terms have shifted, and this can be seen to be down to advances in scientific understanding of the underlying computing principles and paradigms, and also down to the emergence of particular patterns of software. In particular, the GUI and Windowing systems of the early nineties helped to define many of these terms, but since the effective hybridisation of OS Kernel and Windowing system for mass consumer operating systems (bar perhaps Linux), and the mass adoption of dependency injection/inversion of control as a mechanism to consume libraries and frameworks, these terms have had to change their respective meanings.
P.S. (A year later)
After thinking carefully about this subject for over a year I reject the IoC principle as the defining difference between a framework and a library. There ARE a large number of popular authors who say that it is, but there are an almost equal number of people who say that it isn't. There are simply too many 'Frameworks' out there which DO NOT use IoC to say that it is the defining principle. A search for embedded or micro controller frameworks reveals a whole plethora which do NOT use IoC and I now believe that the .NET language and CLR is an acceptable descendant of the "general" framework. To say that IoC is the defining characteristic is simply too rigid for me to accept I'm afraid, and rejects out of hand anything putting itself forward as a framework which matches the historical representation as mentioned above.
For details of non-IoC frameworks, see, as mentioned above, many embedded and micro frameworks, as well as any historical framework in a language that does not provide callback through the language (OK. Callbacks can be hacked for any device with a modern register system, but not by the average programmer), and obviously, the .NET framework.
A library is simply a collection of methods/functions wrapped up into a package that can be imported into a code project and re-used.
A framework is a robust library or collection of libraries that provides a "foundation" for your code. A framework follows the Inversion of Control pattern. For example, the .NET framework is a large collection of cohesive libraries on top of which you build your application. You can argue there isn't a big difference between a framework and a library, but when people say "framework" it typically implies a larger, more robust suite of libraries which will play an integral part in an application.
I think of a toolkit the same way I think of an SDK. It comes with documentation, examples, libraries, wrappers, etc. Again, you can say this is the same as a framework and you would probably be right to do so.
They can almost all be used interchangeably.
They are very, very similar: a framework is usually a bit more developed and complete than a library, and a toolkit can simply be a collection of similar libraries and frameworks.
It's a really good question that is maybe even the slightest bit subjective in nature, but I believe that is about the best answer I could give.
Library
I think it's unanimous that a library is code already coded that you can use so as not to have to code it again. The code must be organized in a way that allows you to look up the functionality you want and use it from your own code.
Most programming languages come with standard libraries, especially some code that implements some kind of collection. This is always for the convenience that you don't have to code these things yourself. Similarly, most programming languages have constructs to allow you to look up functionality from libraries, with things like dynamic linking, namespaces, etc.
So code that finds itself often needed to be re-used is great code to be put inside a library.
Toolkit
A set of tools used for a particular purpose. This is unanimous. The question is, what is considered a tool and what isn't. I'd say there's no fixed definition, it depends on the context of the thing calling itself a toolkit. Example of tools could be libraries, widgets, scripts, programs, editors, documentation, servers, debuggers, etc.
Another thing to note is the "particular purpose". This is always true, but the scope of the purpose can easily change based on who made the toolkit. So it can easily be a programmer's toolkit, or it can be a string parsing toolkit. One is so broad that it could have tools touching everything programming related, while the other is more precise.
SDKs are generally toolkits, in that they try to bundle a set of tools (often of multiple kinds) into a single package.
I think the common thread is that a tool does something for you, either completely, or it helps you do it. And a toolkit is simply a set of tools which all perform or help you perform a particular set of activities.
Framework
Frameworks aren't quite as unanimously defined. It seems to be a bit of a blanket term for anything that can frame your code. Which would mean: any structure that underlies or supports your code.
This implies that you build your code against a framework, whereas you build a library against your code.
But it seems that sometimes the word framework is used in the same sense as toolkit or even library. The .NET Framework is mostly a toolkit, because it's composed of the FCL, which is a library, and the CLR, which is a virtual machine. So you would consider it a toolkit for C# development on Windows, with Mono being a toolkit for C# development on Linux. Yet they called it a framework. It makes sense to think of it this way too, since it kind of frames your code, but a frame should support and hold things together more than do any kind of work, so my opinion is that this is not the way you should use the word.
And I think the industry is trying to move into having framework mean an already written program with missing pieces that you must provide or customize. Which I think is a good thing, since toolkit and library are great precise terms for other usages of "framework".
Framework: installed on your machine and allowing you to interact with it. Without the framework you can't send programming commands to your machine.
Library: aims to solve a certain problem (or several problems related to the same category)
Toolkit: a collection of many pieces of code that can solve multiple problems on multiple issues (just like a toolbox)
It's a little bit subjective, I think. The toolkit is the easiest. It's just a bunch of methods and classes that can be used.
I distinguish the library from the framework by the way you use them. I read the perfect answer somewhere a long time ago: the framework calls your code, whereas your code calls the library.
In relation to the correct answer from Mittag:
A simple example: let's say you implement the ISerializable interface (.NET) in one of your classes. You then make use of the framework qualities of .NET, rather than its library qualities. You fill in the "white spots" (as Mittag said) and you have the skeleton completed. You must know in advance how the framework is going to "react" to your code. Actually .NET IS a framework, and here is where I disagree with the view of Mittag.
The full, complete answer to your question is given very lucidly in Chapter 19 (the whole chapter devoted to just this theme) of this book, which is a very good book by the way (not at all "just for Smalltalk").
Others have noted that .NET may be a framework, a library and a toolkit all at once, depending on which part you use, but perhaps an example helps. Entity Framework, for dealing with databases, is a part of .NET that does use the inversion of control pattern. You let it know your models and it figures out what to do with them. As a programmer, you are required to understand "the mind of the framework", or more realistically the mind of the designer and what they are going to do with your inputs. DataReader and related calls, on the other hand, are simply tools to get data from, or put data into, a table or view and make it available to you. They would never understand how to take a parent-child relationship and translate it from object to relational; you'd use multiple tools to do that. But you would have much more control over how that data was stored, when, in which transactions, etc.

Preventing XSS exploits using the type system as Joel suggested

In Podcast 58 (about 20 minutes in), Jeff complains about the problems of HTML.Encode() and Joel talks about using the type system to have ordinary strings and HTMLStrings:
A brief political rant about the evil of view engines that fail to HTML
encode by default. The problem with
this design choice is that it is not
“safe by default”, which is always the
wrong choice for a framework or API.
Forget to encode some bit of
user-entered data in one single
stinking place in your web app, and
you will be totally owned with XSS.
Believe it. I know because it’s
happened to us. Multiple times!
Joel maintains that, with a strongly-typed language and the right
framework, it’s possible (in theory)
to completely eliminate XSS — this
would require using a specific data
type, a type that is your only way to
send data to the browser. That data
type would be validated at compile
time.
The comments at the blog post mention using static analysis to find potential weaknesses. The transcript Wiki isn't done yet.
Is it possible to implement Joel's suggestion without having a new ASP.NET framework?
Might it be possible to implement it simply by subclassing every control and enforcing new interfaces based on HTMLString? If most people already subclass controls in order to be better able to inject site-specific functionality, wouldn't this be fairly easy to implement?
Would it be worth doing this instead of investing in static analysis?
To use HtmlString everywhere, you would essentially have to rewrite every property and method of every web control. System.String is sealed, so you can't subclass it.
An easier (but still very time consuming) approach would be to use control adapters to replace web controls with safe alternatives. In this case, you would subclass each web control and override the Render methods to HTML-encode dynamic content.
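To make the "separate type for HTML" idea concrete, here is a small sketch in Python rather than ASP.NET. SafeHtml, to_html and render are hypothetical names; html.escape is the real standard-library function. Python checks the type at run time, so this only approximates the compile-time validation Joel describes, although a static type checker could enforce the annotations.

```python
import html


class SafeHtml:
    """Marks text that is already safe to send to the browser."""

    def __init__(self, markup: str):
        self.markup = markup


def to_html(value) -> SafeHtml:
    """Escape plain strings; pass SafeHtml through unchanged."""
    if isinstance(value, SafeHtml):
        return value
    return SafeHtml(html.escape(str(value)))


def render(template: SafeHtml, **fields) -> str:
    """The only output path: every field is converted via to_html."""
    safe = {name: to_html(v).markup for name, v in fields.items()}
    return template.markup.format(**safe)


page = render(SafeHtml("<p>Hello, {name}</p>"), name="<script>alert(1)</script>")
print(page)  # <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Because ordinary strings are escaped by default, forgetting to encode a value is no longer possible; the only way to emit raw markup is to wrap it in SafeHtml explicitly.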

Studying standard library sources

How does one study the code of open-source libraries, particularly standard libraries?
The code base is often vast and hard to navigate. How to find some function or class definition?
Do I search through downloaded source files?
Do I need cvs/svn for that?
Maybe web-search?
Should I just know the structure of the standard library?
Is there any reference on it?
Or do some IDEs have such features? Or some other tools?
How to do it effectively without one?
What are the best practices of doing this in any open-source libraries?
Is there any convention for how sources are organized and handled on Linux/Unix systems?
What are the differences for specific programming languages?
Broad presentation of the subject is highly encouraged.
I mark this 'community wiki' so everyone can rephrase and expand my awkward formulations!
Update: I probably didn't express the problem clearly enough. What I want is to view just the source code of some specific library class or function. The problem is mostly about work organization and usability: how do I navigate the huge pile of sources to find the thing? Maybe there are specific tools or approaches? It feels like some solution(s) for that should have existed for a long time.
One thing to note is that standard libraries are sometimes (often?) optimized more than is good for most production code.
Because they are widely used, they have to perform well over a wide variety of conditions, and may be full of clever tricks and special logic for corner cases.
Maybe they are not the best thing to study as a beginner.
Just a thought.
Well, I think that it's insane to just sit down and read a library's code. My approach is to search whenever I come across the need to implement something myself, and then study the way it's implemented in those libraries.
And there are also a lot of projects/libraries with excellent documentation, which I find more important to read than the code. On Unix-based systems you often find valuable information in the man pages.
Wow, that's a big question.
The short answer: it depends.
The long answer:
Some libraries provide documentation while others don't. Standard libraries are usually pretty well documented, whether your chosen implementation of the library includes documentation or not. For instance, you may have found an implementation of the C standard library without documentation, but the C standard has been around long enough that there are hundreds of good reference books available. Documentation with hyperlinks is a very useful way to learn a new API. In any case, the first place I would look is the library's main website.
For less well known libraries lacking documentation I find two different approaches very helpful.
First is a doc generator. Nearly every language I know of has one. It basically parses a source tree and creates documentation (usually as HTML or XML) which can be used to learn a library. Some use specially formatted comments in the code to create more complete documentation. JavaDoc is one good example of this. Doc generators for many other languages borrow from JavaDoc.
Second, an IDE with a class browser. These act as a sort of on-the-fly documentation. Some display just the library's interface. Others include description comments from the library's source.
Both of these will require access to the library's source (which will come in handy if you intend to actually use a library).
Many of these tools and techniques work equally well for closed/proprietary libraries.
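To show what a doc generator does at its core, here is a minimal sketch in Python using the standard `ast` module. The function name `extract_docs` and the `SAMPLE` source are made up for illustration; real generators handle nesting, signatures, cross-references and output formats.

```python
import ast


def extract_docs(source: str) -> dict:
    """Map each top-level function or class name to its docstring."""
    tree = ast.parse(source)
    docs = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_docstring returns None when the definition has no docstring.
            docs[node.name] = ast.get_docstring(node) or "(undocumented)"
    return docs


# Hypothetical library source used only to exercise the sketch.
SAMPLE = '''
def area(w, h):
    """Return the area of a w-by-h rectangle."""
    return w * h

class Shape:
    pass
'''
```

Calling `extract_docs(SAMPLE)` yields one entry per definition, which is the raw material a generator would then render as HTML or XML.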
The standard Java libraries' source code is available. For a beginning Java programmer these can be a great read. The Collections framework is an especially good place to start. Take for instance the implementation of ArrayList and learn how you can implement a resizable array in Java. Most of the source even has useful comments.
The best parts to read are probably those whose purpose you can understand immediately. Start with the easy pieces and try to follow all the steps that are hidden behind that single call you make from your own code.
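As a taste of what reading such a class teaches, here is a minimal sketch of the resizable-array technique, written in Python rather than Java. It is not the JDK's code: the class name `GrowableArray`, the initial capacity and the growth rule are assumptions chosen for illustration.

```python
class GrowableArray:
    """A fixed-capacity backing store that is swapped for a larger one when full."""

    def __init__(self, capacity: int = 4):
        self._items = [None] * capacity  # stands in for a fixed-size array
        self._size = 0                   # number of slots actually in use

    def append(self, value):
        if self._size == len(self._items):
            self._grow()
        self._items[self._size] = value
        self._size += 1

    def _grow(self):
        # Allocate a bigger block (roughly 1.5x, an assumed factor) and copy over.
        old_capacity = len(self._items)
        bigger = [None] * (old_capacity + (old_capacity >> 1) + 1)
        bigger[: self._size] = self._items[: self._size]
        self._items = bigger

    def capacity(self) -> int:
        return len(self._items)

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._items[index]

    def __len__(self):
        return self._size
```

Growing by a factor instead of by one slot keeps the number of copies low, so appends stay cheap on average.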
Something I do from time to time :
apt-get source foo
Then I create a new C++ project (or whatever) in Eclipse and import the sources.
=> Wow ! Browsable ! (use F3)
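The same jump-to-definition idea works without an IDE in languages whose runtime can report where an object was defined. A minimal sketch in Python, using the standard `inspect` module; the helper name `locate` is made up, and it only works for code that is implemented in Python rather than in C.

```python
import inspect
import json


def locate(obj):
    """Return (file path, first line number, source lines) for a Python object."""
    path = inspect.getsourcefile(obj)
    lines, first_line = inspect.getsourcelines(obj)
    return path, first_line, lines


# Example: find where the standard library defines json.dumps.
path, first_line, lines = locate(json.dumps)
```

Printing `path` and `first_line` tells you which file to open and where, and `"".join(lines)` is the source of that one function.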

runnable pseudocode?

I am attempting to determine prior art for the following idea:
1) user types in some code in a language called (insert_name_here);
2) user chooses a destination language from a list of well-known output candidates (javascript, ruby, perl, python);
3) the processor translates insert_name_here into runnable code in destination language;
4) the processor then runs the code using the relevant system call based on the chosen language
This works because there is a pre-established 1-to-1 mapping from every language construct in insert_name_here to every supported destination language.
(Disclaimer: This obviously does not produce "elegant" code that is well-tailored to the destination language. It simply does a rudimentary translation that is runnable. The purpose is to allow developers to get a quick-and-dirty implementation of algorithms in several different languages for those cases where they do not feel like re-inventing the wheel, but are required for whatever reason to work with a specific language on a specific project.)
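The steps above can be sketched in a few lines of Python. This is a toy under assumptions: the source "language" is represented as already-parsed tuples rather than text, only two constructs are mapped, and `TEMPLATES`, `translate` and `run_python` are hypothetical names. Step 4 is shown for the Python target only, since the other interpreters may not be installed.

```python
import subprocess
import sys

# One template per construct per target: the 1-to-1 mapping in miniature.
TEMPLATES = {
    "assign": {
        "python": "{name} = {value}",
        "ruby": "{name} = {value}",
        "javascript": "var {name} = {value};",
        "perl": "my ${name} = {value};",
    },
    "print": {
        "python": "print({name})",
        "ruby": "puts {name}",
        "javascript": "console.log({name});",
        "perl": 'print "${name}\\n";',
    },
}


def translate(program, target):
    """Turn a list of (construct, arguments) pairs into source text for target."""
    return "\n".join(TEMPLATES[op][target].format(**args) for op, args in program)


def run_python(code):
    """Run translated Python code in a child interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return result.stdout


# Steps 1 and 2: a tiny program in the source representation.
PROGRAM = [("assign", {"name": "x", "value": "42"}), ("print", {"name": "x"})]
```

`translate(PROGRAM, "ruby")` gives `x = 42` followed by `puts x`. The table grows with every construct and every target, which is where a strict 1-to-1 mapping starts to strain.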
Does this already exist?
The .NET CLR is designed such that C++.Net, C#.Net, and VB.Net all compile to the same intermediate language (CIL), and you can "decompile" that CIL back into any one of those languages.
So yes, I would say it already exists though not exactly as you describe.
There are converters available for different languages. The problem you are going to have is dealing with libraries. While mapping between language statements might be easy, finding mappings between library functions will be very difficult.
I'm not really sure how useful that type of code generator would be. Why would you want to write something in one language and then immediately convert it to something else? I can see the rationale for 4th Gen languages that convert diagrams or models into code but I don't really see the point of your effort.
Yes, a program that transforms a program from one representation to another does exist. It's called a "compiler".
And as to your question whether that is always possible: as long as your target language is at least as powerful as the source language, then it is possible. So, if your target language is Turing-complete, then it is always possible, because there can be no language that is more powerful than a Turing-complete language.
However, there does not need to be a dumb 1:1 mapping.
For example: the Microsoft Volta compiler, which compiles CIL bytecode to JavaScript source code, has a problem: .NET has threads, JavaScript doesn't. But you can implement threads with continuations. Well, JavaScript doesn't have continuations either, but you can implement continuations with exceptions. So, Volta transforms the CIL to CPS and then implements CPS with exceptions. (Newer versions of JavaScript have semi-coroutines in the form of generators; those could also be used, but Volta is intended to work across a wide range of JavaScript versions, including obviously JScript in Internet Explorer.)
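The generator alternative mentioned in the parenthesis can be illustrated with a small sketch. It is written in Python rather than JavaScript and is not how Volta works: it only shows how cooperative "threads" can be built on semi-coroutines. The names `worker` and `run_round_robin` are made up.

```python
from collections import deque


def worker(name, steps, log):
    """A 'thread' body: each yield hands control back to the scheduler."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield


def run_round_robin(tasks):
    """Resume each generator in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # finished, so it is not rescheduled
        queue.append(task)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
# log is now ["a:0", "b:0", "a:1", "b:1", "b:2"]
```

The two workers interleave even though only one of them runs at any moment, which is the effect a compiler needs when the target language has no threads.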
This seems a little bizarre. If you're using the term "prior art" in its most common form, you're discussing a potentially patentable idea. If that is the case, you have:
1/ Published the idea, starting the clock running on patent filing - I'm assuming, perhaps incorrectly, that you're based in the U.S. Other jurisdictions may have other rules.
2/ Told the entire planet your idea, which means it's pretty much useless to try and patent it, unless you act very fast.
If you're not thinking about patenting this and were just using the term "prior art" in a layperson's sense, I apologize. I work for a company that takes patents very seriously and it's drilled into us, in great detail, what we're allowed to do with information before filing.
Having said that, patentable ideas must be novel, useful and non-obvious. I would think that your idea would not pass the third of these, since you're describing a language translator, which would have the prior art of the many Pascal-to-C and Fortran-to-C converters out there.
The one glimmer of hope would be the ability of your idea to generate one of multiple output languages (which p2c and f2c don't do) but I think even that would be covered by the likes of cross compilers (such as gcc) which turn source into one of many different object languages.
IBM has a product called VisualAge Generator in which you code in one (proprietary) language and it's converted into COBOL/C/Java/others to run on different target platforms from PCs to the big honkin' System z mainframes, so there's your first problem (thinking about patenting an idea that IBM, the biggest patenter in the world, is already using).
Tons of them. p2c, f2c, and the original implementations of C++ and Objective-C strike me immediately. Beyond that, it's kind of hard to distinguish what you're describing from any compiler, especially for us old guys whose compilers generated ASM code for an intermediate representation anyway.