What application virtual machines are written in high level languages? - vm-implementation

What application virtual machines are out there that are written in higher level languages? C/C++ looks like the languages of choice (for obvious reasons).
What I have found on google is at least two written in Java (both meta-circular) :
JikesRVM and Maxine.
Anything else that you have found?

Many Scheme implementations are written in Scheme and although many of those are compilers or interpreters, some of those are VMs,
some CommonLisp implementations are written in CommonLisp and although many of those are compilers or interpreters, some of those are VMs,
the PyPy VM is written in RPython, which is a subset of Python with "syntax and semantics of Python, speed of C, restrictions of Java and compiler error messages as penetrable as MUMPS",
the Squeak Smalltalk VM is written in Slang (a subset of Squeak Smalltalk) and
the Klein Metacircular VM is written entirely in Self.
Of those, the most interesting are Klein and Maxine (whose design is actually based on Klein). Metacircular Lisp and Scheme implementations usually assume the existence of some basic primitive special forms, which then have to be implemented in assembler, C or a limited subset of the language in a low-level style. Squeak and PyPy use a limited subset of the language. Jikes uses "magic" methods and low-level style.
The idea of Klein and Maxine is that everything is written in high-level, object-oriented, expressive, idiomatic style. In the current version of Klein, there are only two tiny places where the style is hampered by some restriction: in the implementation of message sending, you cannot send any messages and in the implementation of object cloning you cannot clone any objects. However, the current compiler can actually inline or even completely optimize away object cloning and message sending, so those two places could be rewritten in normal OO Self style – it's just that nobody has done it yet.
All of that was just metacircular VMs. There's also other VMs written in high-level languages:
HotRuby is a Ruby VM (actually, a YARV VM) written in JavaScript,
Red Sun is a Ruby VM (actually, a YARV VM) written in ActionScript,
Rava is a JVM-like VM written in Ruby by Koichi "ko1" Sasada, the author of YARV and
Ruva is a JVM-like VM written in Ruby

Some more VM implementations are in TCL (tool command language) and lua (sometimes named as java) and some are written in an assembler. Other variants are written in a computer hardware system programming language of the manufacturer.

Related

Advantages of a VM

The majority of languages I have come across utilise a VM, or virtual machine. Languages such as Java (the JVM), Python, Ruby, PHP (the HHVM), etc.
Then there are languages such as C, C++, Haskell, etc. which compile directly to native.
My question is, what is the advantage of using a VM (outside of OS-independence)? Isn't using a VM just creating an extra interpretation step, by going [source code -> bytecode -> native] instead of just [source code -> native]?
Why use a VM when you can compile directly?
EDIT
My understanding is that Python, Ruby, et al. use something akin to a VM, if not exactly fitting under such a definition, where scripts are compiled to an intermediate representation (for Python, e.g. .pyc files).
EDIT 2
Yep. Looked it up. Python, Ruby and PHP all use intermediate representations, but are simply not stored in seperate files but executed by the VM directly. See question : Java "Virtual Machine" vs. Python "Interpreter" parlance?
" Even though Python uses a virtual machine under the covers, from a
user's perspective, one can ignore this detail most of the time. "
An advantage of VM is that, it is much easier to modify some parts of the code on runtime, which is called Reflection. It brings some elegance capabilities. For example, you can ask the user which function/class he want to call, and call the function/class by its STRING name. In Java programs (and maybe some other VM-based languages) users can add additional library to the program in runtime, and the library can be run immediately!
Another advantage is the ability to use advanced garbage collection, because the bytecode's structure is easier to analyze.
Let me note that a virtual machine does not always interpret the code, and therefore it is not always slower than machine code. For example, Java has a component named hotspot which searches for code blocks that are frequently called, and replaces their bytecode with native code (machine code). For instance, if a for loop is called for, say , 100+ times, hotspot converts it to machine-code, so that in the next calls it will run natively! This insures that just the bottlenecks of your code are running natively, while the rest part allows for the above advantages.
P.S. It is not impossible to compile the code directly to native code. Many VM-based languages have compiler versions (e.g. there is a compiler for PHP: http://www.phpcompiler.org). However, remember that you are disabling some of the above features by compiling the whole program to native code.
P.S. The [source-code -> byte-code] part is not a problem, it is compiled once and does not relate to execution time. I presumed you are asking why they do not execute the machine code while it is possible.
Python, Ruby, and PhP do not utilize VMs. They are, however, interpreted.
To answer your actual question: Java utilizes a VM in order to add some distance between the operating system/hardware and the code being executed. The goal there was security and hardiness (hardiness meaning there was a lower likelihood of code having an averse effect on other processes in the system.)
All the languages you listed are interpreted so I think what you may have actually meant to ask was the difference between interpreted and compiled languages. Interpreted languages are cross-platform. That is the biggest, and main, advantage. You need not compile them for each different set of hardware or operating system they operate on, and instead they will simply work everywhere.
The advantage of a compiled language, traditionally, is speed and efficiency.
Because a VM allows for the same set of instructions to be run on my different operating systems (provided they have the interperetor)
Let's take Java as an example. Java gets compiled into bytecode, which is basically a set of operations for a computer to follow. However, not all processors in computers understand the same set of instructions the same way - meaning, what one set of native instruction means on computer A could be something different on computer B.
As a result, a VM is run, with one specific to each computer. This way, the Java bytecode that is written is standardized, and only the interpreter has to work to convert it to machine language.
OS independence is a big part of it but you also get abstractions from other things like CPUs... the same Java code can execute on ARM, x86, whatever without modification so long as there is a JVM in place.

Name of process of using multiple languages together

I'm wondering what is the formal name of process of using multiple languages together.
Lets say I'm writing a program in C++ which calls Java functions (and uses Java libraries) and sometimes calls Python functions. Then it gathers the results from those calls and continues execution.
How would you name this process?
Depending on how many different languages you use, how small the subproblems are for which you use different languages, how specific those languages are for the subproblem at hand, and how many of those languages you designed yourself to solve that specific subproblem, it might be called Language-Oriented Programming, Polyglot Programming or just Programming.
For example, just using C++ you actually use three languages: C++ itself, the C++ template language (which is basically a hybrid functional / logic programming language) and the C++ macro language. Throw in make and sh for building, JSON for configuration, roff for documenting, and Tcl for testing, and you are looking at 8 languages. However, I would just call that normal Programming, nothing special about it. The same applies to a typical web project combining HTML, CSS, ECMAScript, JSON, SQL, Java, XML, sh.
Language-Oriented Programming is at the other end of the spectrum. In LOP, you break your problem apart into ever smaller subproblems, sub-subproblems and so on, and then you solve every subproblem with a language that is most suited for that particular subproblem, possibly one you designed specifically for that subproblem. Basically, in LOP, you use Languages the same way you use Objects in OOP, Procedures in PP, Functions in FP and so on. Typically, those languages are Domain-Specific and often not Turing-complete.
Polyglot Programming is somewhere in the middle: you use different languages for different larger components, but not a the same level of abstraction as, say, individual objects, and you usually use pre-existing Turing-complete general-purpose languages, instead of designing them yourself. For example, trend.ly used "Smalltalk for thinking, Java for brute-force computing, ECMAScript for visualizing, Ruby for gluing those three together and sh for deployment". Your description sounds most like Polyglot Programming to me.
Note that those definitions are very subjective: for example, in Lisp, designing and implementing new languages is so obvious, natural and trivial, that no Lisp programmer would call what he does "Language-Oriented Programming". They just call it "Programming".
I wouldn't put a hard and fast rule on it, saying "between 5 and 10 languages it's Polyglot, more is Language-Oriented, less is just Programming". It's more a mindset: when you look at a problem, what's the first thing that comes to mind "How can I solve this in this language", "What would be the best language to solve this in" or "What would the perfect language to solve this problem in look like"?
It's called polyglot programming.

Framework vs. Toolkit vs. Library [duplicate]

This question already has answers here:
What is the difference between a framework and a library? [closed]
(22 answers)
Closed 6 years ago.
What is the difference between a Framework, a Toolkit and a Library?
The most important difference, and in fact the defining difference between a library and a framework is Inversion of Control.
What does this mean? Well, it means that when you call a library, you are in control. But with a framework, the control is inverted: the framework calls you. (This is called the Hollywood Principle: Don't call Us, We'll call You.) This is pretty much the definition of a framework. If it doesn't have Inversion of Control, it's not a framework. (I'm looking at you, .NET!)
Basically, all the control flow is already in the framework, and there's just a bunch of predefined white spots that you can fill out with your code.
A library on the other hand is a collection of functionality that you can call.
I don't know if the term toolkit is really well defined. Just the word "kit" seems to suggest some kind of modularity, i.e. a set of independent libraries that you can pick and choose from. What, then, makes a toolkit different from just a bunch of independent libraries? Integration: if you just have a bunch of independent libraries, there is no guarantee that they will work well together, whereas the libraries in a toolkit have been designed to work well together – you just don't have to use all of them.
But that's really just my interpretation of the term. Unlike library and framework, which are well-defined, I don't think that there is a widely accepted definition of toolkit.
Martin Fowler discusses the difference between a library and a framework in his article on Inversion of Control:
Inversion of Control is a key part of
what makes a framework different to a
library. A library is essentially a
set of functions that you can call,
these days usually organized into
classes. Each call does some work and
returns control to the client.
A framework embodies some abstract
design, with more behavior built in.
In order to use it you need to insert
your behavior into various places in
the framework either by subclassing or
by plugging in your own classes. The
framework's code then calls your code
at these points.
To summarize: your code calls a library but a framework calls your code.
Diagram
If you are a more visual learner, here is a diagram that makes it clearer:
(Credits: http://tom.lokhorst.eu/2010/09/why-libraries-are-better-than-frameworks)
The answer provided by Barrass is probably the most complete. However, the explanation could easily be stated more clearly. Most people miss the fact that these are all nested concepts. So let me lay it out for you.
When writing code:
eventually you discover sections of code that you're repeating in your program, so you refactor those into Functions/Methods.
eventually, after having written a few programs, you find yourself copying functions you already made into new programs. To save yourself time you bundle those functions into Libraries.
eventually you find yourself creating the same kind of user interfaces every time you make use of certain libraries. So you refactor your work and create a Toolkit that allows you to create your UIs more easily from generic method calls.
eventually, you've written so many apps that use the same toolkits and libraries that you create a Framework that has a generic version of this boilerplate code already provided so all you need to do is design the look of the UI and handle the events that result from user interaction.
Generally speaking, this completely explains the differences between the terms.
Introduction
There are various terms relating to collections of related code, which have both historical (pre-1994/5 for the purposes of this answer) and current implications, and the reader should be aware of both, particularly when reading classic texts on computing/programming from the historic era.
Library
Both historically, and currently, a library is a collection of code relating to a specific task, or set of closely related tasks which operate at roughly the same level of abstraction. It generally lacks any purpose or intent of its own, and is intended to be used by (consumed) and integrated with client code to assist client code in executing its tasks.
Toolkit
Historically, a toolkit is a more focused library, with a defined and specific purpose. Currently, this term has fallen out of favour, and is used almost exclusively (to this author's knowledge) for graphical widgets, and GUI components in the current era. A toolkit will most often operate at a higher layer of abstraction than a library, and will often consume and use libraries itself. Unlike libraries, toolkit code will often be used to execute the task of the client code, such as building a window, resizing a window, etc. The lower levels of abstraction within a toolkit are either fixed, or can themselves be operated on by client code in a proscribed manner. (Think Window style, which can either be fixed, or which could be altered in advance by client code.)
Framework
Historically, a framework was a suite of inter-related libraries and modules which were separated into either 'General' or 'Specific' categories. General frameworks were intended to offer a comprehensive and integrated platform for building applications by offering general functionality, such as cross platform memory management, multi-threading abstractions, dynamic structures (and generic structures in general). Historical general frameworks (Without dependency injection, see below) have almost universally been superseded by polymorphic templated (parameterised) packaged language offerings in OO languages, such as the STL for C++, or in packaged libraries for non-OO languages (guaranteed Solaris C headers). General frameworks operated at differing layers of abstraction, but universally low level, and like libraries relied on the client code carrying out it's specific tasks with their assistance.
'Specific' frameworks were historically developed for single (but often sprawling) tasks, such as "Command and Control" systems for industrial systems, and early networking stacks, and operated at a high level of abstraction and like toolkits were used to carry out execution of the client codes tasks.
Currently, the definition of a framework has become more focused and taken on the "Inversion of Control" principle as mentioned elsewhere as a guiding principle, so program flow, as well as execution is carried out by the framework. Frameworks are still however targeted either towards a specific output; an application for a specific OS for example (MFC for MS Windows for example), or for more general purpose work (Spring framework for example).
SDK: "Software Development Kit"
An SDK is a collection of tools to assist the programmer to create and deploy code/content which is very specifically targeted to either run on a very particular platform or in a very particular manner. An SDK can consist of simply a set of libraries which must be used in a specific way only by the client code and which can be compiled as normal, up to a set of binary tools which create or adapt binary assets to produce its (the SDK's) output.
Engine
An Engine (In code collection terms) is a binary which will run bespoke content or process input data in some way. Game and Graphics engines are perhaps the most prevalent users of this term, and are almost universally used with an SDK to target the engine itself, such as the UDK (Unreal Development Kit) but other engines also exist, such as Search engines and RDBMS engines.
An engine will often, but not always, allow only a few of its internals to be accessible to its clients. Most often to either target a different architecture, change the presentation of the output of the engine, or for tuning purposes. Open Source Engines are by definition open to clients to change and alter as required, and some propriety engines are fixed completely. The most often used engines in the world however, are almost certainly JavaScript Engines. Embedded into every browser everywhere, there are a whole host of JavaScript engines which will take JavaScript as an input, process it, and then output to render.
API: "Application Programming Interface"
The final term I am answering is a personal bugbear of mine: API, was historically used to describe the external interface of an application or environment which, itself was capable of running independently, or at least of carrying out its tasks without any necessary client intervention after initial execution. Applications such as Databases, Word Processors and Windows systems would expose a fixed set of internal hooks or objects to the external interface which a client could then call/modify/use, etc to carry out capabilities which the original application could carry out. API's varied between how much functionality was available through the API, and also, how much of the core application was (re)used by the client code. (For example, a word processing API may require the full application to be background loaded when each instance of the client code runs, or perhaps just one of its linked libraries; whereas a running windowing system would create internal objects to be managed by itself and pass back handles to the client code to be utilised instead.
Currently, the term API has a much broader range, and is often used to describe almost every other term within this answer. Indeed, the most common definition applied to this term is that an API offers up a contracted external interface to another piece of software (Client code to the API). In practice this means that an API is language dependent, and has a concrete implementation which is provided by one of the above code collections, such as a library, toolkit, or framework.
To look at a specific area, protocols, for example, an API is different to a protocol which is a more generic term representing a set of rules, however an individual implementation of a specific protocol/protocol suite that exposes an external interface to other software would most often be called an API.
Remark
As noted above, historic and current definitions of the above terms have shifted, and this can be seen to be down to advances in scientific understanding of the underlying computing principles and paradigms, and also down to the emergence of particular patterns of software. In particular, the GUI and Windowing systems of the early nineties helped to define many of these terms, but since the effective hybridisation of OS Kernel and Windowing system for mass consumer operating systems (bar perhaps Linux), and the mass adoption of dependency injection/inversion of control as a mechanism to consume libraries and frameworks, these terms have had to change their respective meanings.
P.S. (A year later)
After thinking carefully about this subject for over a year I reject the IoC principle as the defining difference between a framework and a library. There ARE a large number of popular authors who say that it is, but there are an almost equal number of people who say that it isn't. There are simply too many 'Frameworks' out there which DO NOT use IoC to say that it is the defining principle. A search for embedded or micro controller frameworks reveals a whole plethora which do NOT use IoC and I now believe that the .NET language and CLR is an acceptable descendant of the "general" framework. To say that IoC is the defining characteristic is simply too rigid for me to accept I'm afraid, and rejects out of hand anything putting itself forward as a framework which matches the historical representation as mentioned above.
For details of non-IoC frameworks, see, as mentioned above, many embedded and micro frameworks, as well as any historical framework in a language that does not provide callback through the language (OK. Callbacks can be hacked for any device with a modern register system, but not by the average programmer), and obviously, the .NET framework.
A library is simply a collection of methods/functions wrapped up into a package that can be imported into a code project and re-used.
A framework is a robust library or collection of libraries that provides a "foundation" for your code. A framework follows the Inversion of Control pattern. For example, the .NET framework is a large collection of cohesive libraries in which you build your application on top of. You can argue there isn't a big difference between a framework and a library, but when people say "framework" it typically implies a larger, more robust suite of libraries which will play an integral part of an application.
I think of a toolkit the same way I think of an SDK. It comes with documentation, examples, libraries, wrappers, etc. Again, you can say this is the same as a framework and you would probably be right to do so.
They can almost all be used interchangeably.
very, very similar, a framework is usually a bit more developed and complete then a library, and a toolkit can simply be a collection of similar librarys and frameworks.
a really good question that is maybe even the slightest bit subjective in nature, but I believe that is about the best answer I could give.
Library
I think it's unanimous that a library is code already coded that you can use so as not to have to code it again. The code must be organized in a way that allows you to look up the functionality you want and use it from your own code.
Most programming languages come with standard libraries, especially some code that implements some kind of collection. This is always for the convenience that you don't have to code these things yourself. Similarly, most programming languages have construct to allow you to look up functionality from libraries, with things like dynamic linking, namespaces, etc.
So code that finds itself often needed to be re-used is great code to be put inside a library.
Toolkit
A set of tools used for a particular purpose. This is unanimous. The question is, what is considered a tool and what isn't. I'd say there's no fixed definition, it depends on the context of the thing calling itself a toolkit. Example of tools could be libraries, widgets, scripts, programs, editors, documentation, servers, debuggers, etc.
Another thing to note is the "particular purpose". This is always true, but the scope of the purpose can easily change based on who made the toolkit. So it can easily be a programmer's toolkit, or it can be a string parsing toolkit. One is so broad, it could have tool touching everything programming related, while the other is more precise.
SDKs are generally toolkits, in that they try and bundle a set of tools (often of multiple kind) into a single package.
I think the common thread is that a tool does something for you, either completely, or it helps you do it. And a toolkit is simply a set of tools which all perform or help you perform a particular set of activities.
Framework
Frameworks aren't quite as unanimously defined. It seems to be a bit of a blanket term for anything that can frame your code. Which would mean: any structure that underlies or supports your code.
This implies that you build your code against a framework, whereas you build a library against your code.
But, it seems that sometimes the word framework is used in the same sense as toolkit or even library. The .Net Framework is mostly a toolkit, because it's composed of the FCL which is a library, and the CLR, which is a virtual machine. So you would consider it a toolkit to C# development on Windows. Mono being a toolkit for C# development on Linux. Yet they called it a framework. It makes sense to think of it this way too, since it kinds of frame your code, but a frame should more support and hold things together, then do any kind of work, so my opinion is this is not the way you should use the word.
And I think the industry is trying to move into having framework mean an already written program with missing pieces that you must provide or customize. Which I think is a good thing, since toolkit and library are great precise terms for other usages of "framework".
Framework: installed on you machine and allowing you to interact with it. without the framework you can't send programming commands to your machine
Library: aims to solve a certain problem (or several problems related to the same category)
Toolkit: a collection of many pieces of code that can solve multiple problems on multiple issues (just like a toolbox)
It's a little bit subjective I think. The toolkit is the easiest. It's just a bunch of methods, classes that can be use.
The library vs the framework question I make difference by the way to use them. I read somewhere the perfect answer a long time ago. The framework calls your code, but on the other hand your code calls the library.
In relation with the correct answer from Mittag:
a simple example. Let's say you implement the ISerializable interface (.Net) in one of your classes. You make use of the framework qualities of .Net then, rather than it's library qualities. You fill in the "white spots" (as mittag said) and you have the skeleton completed. You must know in advance how the framework is going to "react" with your code. Actually .net IS a framework, and here is where i disagree with the view of Mittag.
The full, complete answer to your question is given very lucidly in Chapter 19 (the whole chapter devoted to just this theme) of this book, which is a very good book by the way (not at all "just for Smalltalk").
Others have noted that .net may be both a framework and a library and a toolkit depending on which part you use but perhaps an example helps. Entity Framework for dealing with databases is a part of .net that does use the inversion of control pattern. You let it know your models it figures out what to do with them. As a programmer it requires you to understand "the mind of the framework", or more realistically the mind of the designer and what they are going to do with your inputs. datareader and related calls, on the other hand, are simply a tool to go get or put data to and from table/view and make it available to you. It would never understand how to take a parent child relationship and translate it from object to relational, you'd use multiple tools to do that. But you would have much more control on how that data was stored, when, transactions, etc.

runnable pseudocode?

I am attempting to determine prior art for the following idea:
1) user types in some code in a language called (insert_name_here);
2) user chooses a destination language from a list of well-known output candidates (javascript, ruby, perl, python);
3) the processor translates insert_name_here into runnable code in destination language;
4) the processor then runs the code using the relevant system call based on the chosen language
The reason this works is because there is a pre-established 1 to 1 mapping between all language constructs from insert_name_here to all supported destination languages.
(Disclaimer: This obviously does not produce "elegant" code that is well-tailored to the destination language. It simply does a rudimentary translation that is runnable. The purpose is to allow developers to get a quick-and-dirty implementation of algorithms in several different languages for those cases where they do not feel like re-inventing the wheel, but are required for whatever reason to work with a specific language on a specific project.)
Does this already exist?
The .NET CLR is designed such that C++.Net, C#.Net, and VB.Net all compile to the same machine language, and you can "decompile" that CLI back in to any one of those languages.
So yes, I would say it already exists though not exactly as you describe.
There are converters available for different languages. The problem you are going to have is dealing with libraries. While mapping between language statements might be easy, finding mappings between library functions will be very difficult.
I'm not really sure how useful that type of code generator would be. Why would you want to write something in one language and then immediately convert it to something else? I can see the rationale for 4th Gen languages that convert diagrams or models into code but I don't really see the point of your effort.
Yes, a program that transform a program from one representation to another does exist. It's called a "compiler".
And as to your question whether that is always possible: as long as your target language is at least as powerful as the source language, then it is possible. So, if your target language is Turing-complete, then it is always possible, because there can be no language that is more powerful than a Turing-complete language.
However, there does not need to be a dumb 1:1 mapping.
For example: the Microsoft Volta compiler which compiles CIL bytecode to JavaScript sourcecode has a problem: .NET has threads, JavaScript doesn't. But you can implement threads with continuations. Well, JavaScript doesn't have continuations either, but you can implement continuations with exceptions. So, Volta transforms the CIL to CPS and then implements CPS with exceptions. (Newer versions of JavaScript have semi-coroutines in the form of generators; those could also be used, but Volta is intended to work across a wide range of JavaScript versions, including obviously JScript in Internet Explorer.)
This seems a little bizarre. If you're using the term "prior art" in its most common form, you're discussing a potentially patentable idea. If that is the case, you have:
1/ Published the idea, starting the clock running on patent filing - I'm assuming, perhaps incorrectly, that you're based in the U.S. Other jurisdictions may have other rules.
2/ Told the entire planet your idea, which means it's pretty much useless to try and patent it, unless you act very fast.
If you're not thinking about patenting this and were just using the term "prior art" in a laypersons sense, I apologize. I work for a company that takes patents very seriously and it's drilled into us, in great detail, what we're allowed to do with information before filing.
Having said that, patentable ideas must be novel, useful and non-obvious. I would think that your idea would not pass on the third of these since you're describing a language translator which would have the prior art of the many pascal-to-c and fortran-to-c converters out there.
The one glimmer of hope would be the ability of your idea to generate one of multiple output languages (which p2c and f2c don't do) but I think even that would be covered by the likes of cross compilers (such as gcc) which turn source into one of many different object languages.
IBM has a product called Visual Age Generator in which you code in one (proprietary) language and it's converted into COBOL/C/Java/others to run on different target platforms from PCs to the big honkin' System z mainframes, so there's your first problem (thinking about patenting an idea that IBM, the biggest patenter in the world, is already using).
Tons of them. p2c, f2c, and the original implementation s of C++ and Objective C strike me immediately. Beyond that, it's kind of hard to distinguish what you're describing from any compiler, especially for us old guys whose compilers generated ASM code for an intermediate represetation anyway.

What's the difference between a "script" and an "application"?

I'm referring to distinctions such as in this answer:
...bash isn't for writing applications it's for, well, scripting. So sure, your application might have some housekeeping scripts but don't go writing critical-business-logic.sh because another language is probably better for stuff like that.
As programmer who's worked in many languages, this seems to be C, Java and other compiled language snobbery. I'm not looking for reenforcement of my opinion or hand-wavy answers. Rather, I genuinely want to know what technical differences are being referred to.
(And I use C in my day job, so I'm not just being defensive.)
Traditionally a program is compiled and a script is interpreted, but that is not really important anymore. You can generate a compiled version of most scripts if you really want to, and other 'compiled' languages like Java are in fact interpreted (at the byte code level.)
A more modern definition might be that a program is intended to be used by a customer (perhaps an internal one) and thus should include documentation and support, while a script is primarily intended for the use of the author.
The web is an interesting counter example. We all enjoy looking things up with the Google search engine. The bulk of the code that goes into creating the 'database' it references is used only by its authors and maintainers. Does that make it a script?
I would say that an application tends to be used interactively, where a script would run its course, suitable for batch work. I don't think it's a concrete distinction.
Usually, it is "script" versus "program".
I am with you that this distinction is mostly "compiled language snobbery", or to quote Larry Wall and take the other side of the fence, "a script is what the actors have, a programme is given to the audience".
This is an interesting topic, and I don't think there are very good guidelines for the differentiating a "script" and a "application."
Let's take a look at some Wikipedia articles to get a feel of the distinction.
Script (Wikipedia -> Scripting language):
A scripting language, script language or extension language, is a programming language that controls a software application. "Scripts" are often treated as distinct from "programs", which execute independently from any other application. At the same time they are distinct from the core code of the application, which is usually written in a different language, and by being accessible to the end user they enable the behavior of the application to be adapted to the user's needs.
Application (Wikipedia -> Application software -> Terminology)
In computer science, an application is a computer program designed to help people perform a certain type of work. An application thus differs from an operating system (which runs a computer), a utility (which performs maintenance or general-purpose chores), and a programming language (with which computer programs are created). Depending on the work for which it was designed, an application can manipulate text, numbers, graphics, or a combination of these elements.
Reading the above entries seems to suggest that the distinction is that a script is "hosted" by another piece of software, while an application is not. I suppose that can be argued, such as shell scripts controlling the behavior of the shell, and perl scripts controlling the behavior of the interpreter to perform desired operations. (I feel this may be a little bit of a stretch, so I may not completely agree with it.)
When it comes down to it, it is in my opinion that the colloquial distinction can be made in terms of the scale of the program. Scripts are generally smaller in scale when compared to applications.
Also, in terms of the purpose, a script generally performs tasks that needs taken care of, say for example, build scripts that produce multiple release versions for a certain piece of software. On the otherhand, applications are geared toward providing functionality that is more refined and geared toward an end user. For example, Notepad or Firefox.
John Ousterhout (the inventor of TCL) has a good article at http://www.tcl.tk/doc/scripting.html where he proposes a distinction between system programming languages (for implementing building blocks, emphasis on correctness, type safety) vs scripting languages (for combining building blocks, emphasis on responsiveness to changing environments and requirements, easy conversion in and out of textual representations). If you go with that categorisation system, then 99% of programmers are doing jobs that are more appropriate to scripting languages than to system programming languages.
A script tends to be a series of commands that starts, runs, and terminates. It often requires no/little human interaction. An application is a "program"... it often requires human interaction, it tends to be larger.
Script to me implies line-by-line interpretation of the code. You can open a script and view its programmer-readable contents. An application implies a stand-alone compiled executable.
It's often just a semantic argument, or even a way of denigrating certain programming languages. As far as I'm concerned, a "script" is a type of program, and the exact definition is somewhat vague and varies with context.
I might use the term "script" to mean a program that primarily executes linearly, rather than with lots of sequential logic or subroutines, much like a "script" in Hollywood is a linear sequence of instructions for an actor to execute. I might use it to mean a program that is written in a language embedded inside a larger program, for the purpose of driving that program. For example, automating tasks under the old Mac OS with AppleScript, or driving a program that exposes itself in some way with an embedded TCL interface.
But in all those cases, a script is a type of program.
The term "scripting language" has been used for dynamically interpreted (sometimes compiled) languages, usually these have a lot of common features such as very high level instructions, built in hashes and arbitrary-length lists and other high level data structures, etc. But those languages are capable of very large, complicated, modular, well-designed programs, so if you think of a "script" as something other than a program, that term might confuse you.
See also Is it a Perl program or a Perl script? in perlfaq1.
A script generally runs as part of a larger application inside a scripting engine
eg. JavaScript -> Browser
This is in contrast to both traditional static typed compiled languages and to dynamic languages, where the code is intended to form the main part of the application.
An application is a collection of scripts geared toward a common set of problems.
A script is a bit of code for performing one fairly specific task.
IMO, the difference has nothing whatsoever to do with the language that's used. It's possible to write a complex application with bash, and it's possible to write a simple script with C++.
Personally, I think the separation is a step back from the actual implementation.
In my estimation, an application is planned. It has multiple goals, it has multiple deliverables. There are tasks set aside at design time in advance of coding that the application must meet.
A script however, is just thrown together as suits, and little planning is involved.
Lack of proper planning does not however downgrade you to a script. Possibly, it makes your application a poorly organized collection of poorly planned scripts.
Further more, an application can contain scripts that aggregated comprise the whole. But a script can only reference an application.
Taking perl as an example, you can write perl scripts or perl applications.
A script would imply a single file or a single namespace. (e.g. updateFile.pl).
An application would be something made up of a collection of files or namespaces/classes (e.g. an OO-designed perl application with many .pm module files).
An application is big and will be used over and over by people and maybe sold to a customer.
A script starts out small, stays small if you're lucky, is rarely sold to a customer, and might either be run automatically or fall into disuse.
What about:
Script:
A script is text file (or collection of text files) of programming statements written in a language which allows individual statements written in it to be interpreted to machine executable code directly before each is executed and with the intention of this occurring.
Application:
An application is any computer program whose primary functionality involves providing service to a human Actor.
A script-based program written in a scripting language can therefore, theoretically, have its textual statements altered while the script is being executed (at great risk of , of course). The analogous situation for compiled programs is flipping bits in memory.
Any takers? :)
First of all, I would like to make it crystal clear that a script is a program. In other words, a script is a set of instructions.
Program:
A set of instructions which is going to be compiled is known as a Program.
Script:
A set of instructions which is going to be interpreted is known as a Script.
#Jeff's answer is good. My favorite explanation is
Many (most?) scripting languages are interpreted, and few compiled
languages are considered to be scripting languages, but the question
of compiled vs. interpreted is only loosely connected to the question
of "scripting" vs. "serious" languages.
A lot of the problem here is that "scripting" is a pretty vague
designation -- it means a language that's convenient for writing
scripts in, as opposed to writing "full-blown programs" (or
applications). But how does one distinguish a complex script from a
simple application? That's an essentially unanswerable question.
Generally, a script is a series of commands applied to some set of
data, possibly in a user-defined order... but then, one could stretch
that description to apply to Photoshop, which is clearly a major
application. Scripts are generally smaller than applications, do
some well-defined thing and are "simpler" to use, and typically can
be decomposed into a clear series of sub-operations, but all of these
things are subjective.
Referenced from here.
I think that there is no matter at all whether code is compiled or interpreted.
The true difference is in core logic of code:
If code makes new functionality that is not implemented in other programs in system - it's a program. It even can be manipulated by a script.
If code is MAINLY manipulates by actions of other programs and total result is MAINLY the results of work of manipulated programs - it's a script. Literally a script of actions for some programs.
Actually the difference between a script ( or a scripting language) and an application is that a script don't require it to be compiled into machine language.. You run the source of the script with an interpreter.. A application compiles the source into machine code so that you can run it as a stand alone application.
I would say a script is usually a set of commands or instructions written in plain text that are executed by a hosting application (browser, command interpreter or shell,...).
It does not mean it's not powerfull or not compiled in some way when it's actually executed. But a script cannot do anything by itself, it's just plain text.
By nature it can be a fragment only, needing to be combined to build a program or an application, but extended and fully developed scripts or set of scripts can be considered programs or applications when executed by the host, just like a bunch of source files can become an application once compiled.
A scripting language doesn't have a standard library or platform (or not much of one). It's small and light, designed to be embedded into a larger application. Bash and Javascript are great examples of scripting languages because they rely absolutely on other programs for their functionality.
Using this definition, a script is code designed to drive a larger application (suite). A Javascript might call on Firefox to open windows or manipulate the DOM. A Bash script executes existing programs or other scripts and connects them together with pipes.
You also ask why not scripting languages, so:
Are there even any unit-testing tools for scripting languages? That seems a very important tool for "real" applications that is completely missing. And there's rarely any real library bindings for scripting languages.
Most of the times, scripts could be replaced with a real, light language like Python or Ruby anyway.