What is instrumentation? - language-agnostic

I've heard this term used a lot in the same context as logging, but I can't seem to find a clear definition of what it actually is.
Is it simply a more general class of logging/monitoring tools and activities?
Please provide sample code/scenarios when/how instrumentation should be used.

I write tools that perform instrumentation. So here is what I think it is.
DLL rewriting. This is what tools like Purify and Quantify do. A previous reply to this question said that they instrument post-compile/link. That is not correct. Purify and Quantify instrument the DLL the first time it is executed after a compile/link cycle, then cache the result so that it can be used more quickly next time around. For large applications, profiling the DLLs can be very time consuming. It is also problematic - at a company I worked at between 1998-2000 we had a large 2 million line app that would take 4 hours to instrument, and 2 of the DLLs would randomly crash during instrumentation and if either failed you would have do delete both of them, then start over.
In place instrumentation. This is similar to DLL rewriting, except that the DLL is not modified and the image on the disk remains untouched. The DLL functions are hooked appropriately to the task required when the DLL is first loaded (either during startup or after a call to LoadLibrary(Ex). You can see techniques similar to this in the Microsoft Detours library.
On-the-fly instrumentation. Similar to in-place but only actually instruments a method the first time the method is executed. This is more complex than in-place and delays the instrumentation penalty until the first time the method is encountered. Depending on what you are doing, that could be a good thing or a bad thing.
Intermediate language instrumentation. This is what is often done with Java and .Net languages (C~, VB.Net, F#, etc). The language is compiled to an intermediate language which is then executed by a virtual machine. The virtual machine provides an interface (JVMTI for Java, ICorProfiler(2) for .Net) which allows you to monitor what the virtual machine is doing. Some of these options allow you to modify the intermediate language just before it gets compiled to executable instructions.
Intermediate language instrumentation via reflection. Java and .Net both provide reflection APIs that allow the discovery of metadata about methods. Using this data you can create new methods on the fly and instrument existing methods just as with the previously mentioned Intermediate language instrumentation.
Compile time instrumentation. This technique is used at compile time to insert appropriate instructions into the application during compilation. Not often used, a profiling feature of Visual Studio provides this feature. Requires a full rebuild and link.
Source code instrumentation. This technique is used to modify source code to insert appropriate code (usually conditionally compiled so you can turn it off).
Link time instrumentation. This technique is only really useful for replacing the default memory allocators with tracing allocators. An early example of this was the Sentinel memory leak detector on Solaris/HP in the early 1990s.
The various in-place and on-the-fly instrumentation methods are fraught with danger as it is very hard to stop all threads safely and modify the code without running the risk of requiring an API call that may want to access a lock which is held by a thread you've just paused - you don't want to do that, you'll get a deadlock. You also have to check if any of the other threads are executing that method, because if they are you can't modify it.
The virtual machine based instrumentation methods are much easier to use as the virtual machine guarantees that you can safely modify the code at that point.
(EDIT - this item added later) IAT hooking instrumentation. This involved modifying the import addess table for functions linked against in other DLLs/Shared Libraries. This type of instrumentation is probably the simplest method to get working, you do not need to know how to disassemble and modify existing binaries, or do the same with virtual machine opcodes. You just patch the import table with your own function address and call the real function from your hook. Used in many commercial and open source tools.
I think I've covered them all, hope that helps.

instrumentation is usually used in dynamic code analysis.
it differs from logging as instrumentation is usually done automatically by software, while logging needs human intelligence to insert the logging code.

It's a general term for doing something to your code necessary for some further analysis.
Especially for languages like C or C++, there are tools like Purify or Quantify that profile memory usage, performance statistics, and the like. To make those profiling programs work correctly, an "instrumenting" step is necessary to insert the counters, array-boundary checks, etc that is used by the profiling programs. Note that in the Purify/Quantify scenario, the instrumentation is done automatically as a post-compilation step (actually, it's an added step to the linking process) and you don't touch your source code.
Some of that is less necessary with dynamic or VM code (i.e. profiling tools like OptimizeIt are available for Java that does a lot of what Quantify does, but no special linking is required) but that doesn't negate the concept.

A excerpt from wikipedia article
In context of computer programming,instrumentation refers to an
ability to monitor or measure the level of a product's performance, to
diagnose errors and to write trace information. Programmers implement
instrumentation in the form of code instructions that monitor specific
components in a system (for example, instructions may output logging
information to appear on screen). When an application contains
instrumentation code, it can be managed using a management tool.
Instrumentation is necessary to review the performance of the
application. Instrumentation approaches can be of two types, source
instrumentation and binary instrumentation.

Whatever Wikipedia says, there is no standard / widely agreed definition for code instrumentation in IT industry.
Please consider, instrumentation is a noun derived from instrument which has very broad meaning.
"Code" is also everything in IT, I mean - data, services, everything.
Hence, code instrumentation is a set of applications that is so wide ... not worth giving it a separate name ;-).
That's probably why this Wikipedia article is only a stub.

Related

Advantages of a VM

The majority of languages I have come across utilise a VM, or virtual machine. Languages such as Java (the JVM), Python, Ruby, PHP (the HHVM), etc.
Then there are languages such as C, C++, Haskell, etc. which compile directly to native.
My question is, what is the advantage of using a VM (outside of OS-independence)? Isn't using a VM just creating an extra interpretation step, by going [source code -> bytecode -> native] instead of just [source code -> native]?
Why use a VM when you can compile directly?
EDIT
My understanding is that Python, Ruby, et al. use something akin to a VM, if not exactly fitting under such a definition, where scripts are compiled to an intermediate representation (for Python, e.g. .pyc files).
EDIT 2
Yep. Looked it up. Python, Ruby and PHP all use intermediate representations, but are simply not stored in seperate files but executed by the VM directly. See question : Java "Virtual Machine" vs. Python "Interpreter" parlance?
" Even though Python uses a virtual machine under the covers, from a
user's perspective, one can ignore this detail most of the time. "
An advantage of VM is that, it is much easier to modify some parts of the code on runtime, which is called Reflection. It brings some elegance capabilities. For example, you can ask the user which function/class he want to call, and call the function/class by its STRING name. In Java programs (and maybe some other VM-based languages) users can add additional library to the program in runtime, and the library can be run immediately!
Another advantage is the ability to use advanced garbage collection, because the bytecode's structure is easier to analyze.
Let me note that a virtual machine does not always interpret the code, and therefore it is not always slower than machine code. For example, Java has a component named hotspot which searches for code blocks that are frequently called, and replaces their bytecode with native code (machine code). For instance, if a for loop is called for, say , 100+ times, hotspot converts it to machine-code, so that in the next calls it will run natively! This insures that just the bottlenecks of your code are running natively, while the rest part allows for the above advantages.
P.S. It is not impossible to compile the code directly to native code. Many VM-based languages have compiler versions (e.g. there is a compiler for PHP: http://www.phpcompiler.org). However, remember that you are disabling some of the above features by compiling the whole program to native code.
P.S. The [source-code -> byte-code] part is not a problem, it is compiled once and does not relate to execution time. I presumed you are asking why they do not execute the machine code while it is possible.
Python, Ruby, and PhP do not utilize VMs. They are, however, interpreted.
To answer your actual question: Java utilizes a VM in order to add some distance between the operating system/hardware and the code being executed. The goal there was security and hardiness (hardiness meaning there was a lower likelihood of code having an averse effect on other processes in the system.)
All the languages you listed are interpreted so I think what you may have actually meant to ask was the difference between interpreted and compiled languages. Interpreted languages are cross-platform. That is the biggest, and main, advantage. You need not compile them for each different set of hardware or operating system they operate on, and instead they will simply work everywhere.
The advantage of a compiled language, traditionally, is speed and efficiency.
Because a VM allows for the same set of instructions to be run on my different operating systems (provided they have the interperetor)
Let's take Java as an example. Java gets compiled into bytecode, which is basically a set of operations for a computer to follow. However, not all processors in computers understand the same set of instructions the same way - meaning, what one set of native instruction means on computer A could be something different on computer B.
As a result, a VM is run, with one specific to each computer. This way, the Java bytecode that is written is standardized, and only the interpreter has to work to convert it to machine language.
OS independence is a big part of it but you also get abstractions from other things like CPUs... the same Java code can execute on ARM, x86, whatever without modification so long as there is a JVM in place.

What are logging libraries for?

This may be a stupid question, as most of my programming consists of one-man scientific computing research prototypes and developing relatively low-level libraries. I've never programmed in the large in an enterprise environment before. I've always wondered, what are the main things that logging libraries make substantially easier than just using good old fashioned print statements or file output, simple programming logic and a few global variables to determine how verbosely things get logged? How do you know when a few print statements or some basic file output ain't gonna cut it and you need a real logging library?
Logging helps debug problems especially when you move to production and problems occur on people's machines you can't control. Best laid plans never survive contact with the enemy, and logging helps you track how that battle went when faced with real world data.
Off the shel logging libraries are easy to plug in and play in less than 5 minutes.
Log libraries allow for various levels of logging per statement (FATAL, ERROR, WARN, INFO, DEBUG, etc).
And you can turn up or down logging to get more of less information at runtime.
Highly threaded systems help sort out what thread was doing what. Log libraries can log information about threads, timestamps, that ordinary print statements can't.
Most allow you to turn on only portions of the logging to get more detail. So one system can log debug information, and another can log only fatal errors.
Logging libraries allow you to configure logging through an external file so it's easy to turn on or off in production without having to recompile, deploy, etc.
3rd party libraries usually log so you can control them just like the other portions of your system.
Most libraries allow you to log portions or all of your statements to one or many files based on criteria. So you can log to both the console AND a log file.
Log libraries allow you to rotate logs so it will keep several log files based on many different criteria. Say after the log gets 20MB rotate to another file, and keep 10 log files around so that log data is always 100MB.
Some log statements can be compiled in or out (language dependent).
Log libraries can be extended to add new features.
You'll want to start using a logging libraries when you start wanting some of these features. If you find yourself changing your program to get some of these features you might want to look into a good log library. They are easy to learn, setup, and use and ubiquitous.
There are used in environments where the requirements for logging may change, but the cost of changing or deploying a new executable are high. (Even when you have the source code, adding a one line logging change to a program can be infeasible because of internal bureaucracy.)
The logging libraries provide a framework that the program will use to emit a wide variety of messages. These can be described by source (e.g. the logger object it is first sent to, often corresponding to the class the event has occurred in), severity, etc.
During runtime the actual delivery of the messaages is controlled using an "easily" edited config file. For normal situations most messages may be obscured altogether. But if the situation changes, it is a simpler fix to enable more messages, without needing to deploy a new program.
The above describes the ideal logging framework as I understand the intention; in practice I have used them in Java and Python and in neither case have I found them worth the added complexity. :-(
They're for logging things.
Or more seriously, for saving you having to write it yourself, giving you flexible options on where logs are store (database, event log, text file, CSV, sent to a remote web service, delivered by pixies on a velvet cushion) and on what is logged at runtime, rather than having to redefine a global variable and then recompile.
If you're only writing for yourself then it's unlikely you need one, and it may introduce an external dependency you don't want, but once your libraries start to be used by others then having a logging framework in place may well help your users, and you, track down problems.
I know that a logging library is useful when I have more than one subsystem with "verbose logging," but where I only want to see that verbose data from one of them.
Certainly this can be achieved by having a global log level per subsystem, but for me it's easier to use a "system" of some sort for that.
I generally have a 2D logging environment too; "Info/Warning/Error" (etc) on one axis and "AI/UI/Simulation/Networking" (etc) on the other. With this I can specify the logging level that I care about seeing for each subsystem easily. It's not actually that complicated once it's in place, indeed it's a lot cleaner than having if my_logging_level == DEBUG then print("An error occurred"); Plus, the logging system can stuff file/line info into the messages, and then getting totally fancy you can redirect them to multiple targets pretty easily (file, TTY, debugger, network socket...).

Open alternatives to Windows Workflow

Pre-warning: There are some other questions similar to this but don't quite answer the question (these include: Alternatives to Windows Workflow Foundation?, Can anyone recommend a .Net open source alternative to Windows Workflow?)
We are developing a system that is an event based state machine, currently we are investigating windows workflow, our system needs to be low latency in its response to events from a multitude of sources (xmpp, http, sms, phone call, email etc etc) coming into the system, scalable and resilient and most importantly customisable. For a variety of reasons (and due diligence) I am looking for open workflow engines that support functions similar to Windows Workflow Foundation (and more - if possible), mainly (but it doesn't matter too much if there are engines that don't support some features):
Persistence of long running tasks, and resumption of tasks on external events
High performance, low latency
Ability to develop custom actions
The ability to specify workflows dynamically
Tracking and tracing
I am not constrained to platform or language, and I would love some help and tips from you guys so that I can start to investigate the engines more closely and any experiences you had with the engines.
Paul.
I invite you to examine Stateless further, as suggested in the answer to my SO question can-anyone-recommend-a-net-open-source-alternative-to-windows-workflow. to achieve the goal of a long running state machine is very simple in that you can store the current state of your state in a database and re-sync the state machine when needed. Consider the following code from the stateless site:
Stateless has been designed with
encapsulation within an ORM-ed domain
model in mind. Some ORMs place
requirements upon where mapped data
may be stored. To this end, the
StateMachine constructor can accept
function arguments that will be used
to read and write the state values:
var stateMachine = new StateMachine<State, Trigger>(
() => myState.Value,
s => myState.Value = s);
With very little effort you can persist your state, then retrieve that state easily later on.
In respect updating the workflow dynamically, if you configure a state machine such as
var stateMachine = new StateMachine<string, int>();
and maintain a separate file of states and triggers in XML, you can perform a configuration at runtime by looping through the string int value pairs.
"Java side":
Apache ODE (Orchestration Director Engine) executes business processes written following the WS-BPEL standard. It talks to web services, sending and receiving messages, handling data manipulation and error recovery as described by your process definition. It supports both long and short living process executions to orchestrate all the services that are part of your application.
http://ode.apache.org/
OSWorkflow can be considered a "low level" workflow implementation. Situations like "loops" and "conditions" that might be represented by a graphical icon in other workflow systems must be "coded" in OSWorkflow.
http://www.opensymphony.com/osworkflow/
Shark is an extendable workflow engine framework including a standard implementation completely based on WfMC specifications using XPDL (without any proprietary extensions !) as its native workflow process definition format and the WfMC "ToolAgents" API for serverside execution of system activitie
http://www.enhydra.org/workflow/shark/index.html
Python side:
http://bika.sourceforge.net/
http://www.vivtek.com/wftk/
I this will help you :-)
You might consider implementing your flow as an actual state machine. Tools like State Machine Compiler and Ragel can help with this. State machines, in many circumstances, are just what you need to implement insanely complex behavior that is testable, and rock-solid. I don't claim to be a Windows work flow expert, but from what I have seen, I question its superiority over coding your own state machine, either by hand or using a tool.
You might want to check out Simple State Machine.
If you feel like you want to have more control over things and want to roll your own it might be helpful to check out the Saga support that projects like NServiceBus and MassTransit use. Sagas look to be very similar to WF workflows but are POCO objects and I believe both projects just use NHibernate for Saga persistence.
I'm going to recommend you take a few hours to look at the book Open-Source ESBs in Action. "Orchestration" and "Choreography" are the key buzzwords to look at when dealing with "enterprise service busses." The systems for .NET are quite expensive (BizTalk is in the price range of a decent car, the price of Tibco is in the price range of a decent house).
Other links:
Open ESB project
Comparison of OpenESB and ServiceMix (both of which are the subject of the "In Action" book above.
Try Drools for JAVA, I personally have never tried it but I know several commercial applications are based on drools.
http://www.jboss.org/drools/
You could also upgrade to .NET 4.0 there are major improvements in the Workflow in the new framework. I know if I was writing a new workflow application I would jump to 4.0.
Good Luck
JBoss JBPM
Consider Workflow Engine, a lightweight all-in-one component that enables you to add custom executable workflows of any complexity to any .NET or Java software, be it your own creation or a third-party solution, with minimal changes to existing code. It supports custom actions and commands, has timers and supports parallel workflows. And there's a free version.
You can take a look at Imixs-Workflow, which is an event driven approach of a state machine based on bpmn 2.0. It specially focuses on human-centric long running tasks.

Is it bad practice to use the system() function when library functions could be used instead? Why?

Say there is some functionality needed for an application under development which could be achieved by making a system call to either a command line program or utilizing a library. Assuming efficiency is not an issue, is it bad practice to simply make a system call to a program instead of utilizing a library? What are the disadvantages of doing this?
To make things more concrete, an example of this scenario would be an application which needs to download a file from a web server, either the cURL program or the libcURL library could be used for this.
Unless you are writing code for only one OS, there is no way of knowing if your system call will even work. What happens when there is a system update or an OS upgrade?
Never use a system call if there is a library to do the same function.
I prefer libraries because of the dependency issue, namely the executable might not be there when you call it, but the library will be (assuming external library references get taken care of when the process starts on your platform). In other words, using libraries would seem to guarantee a more stable, predictable outcome in more environments than system calls would.
There are several factors to take into account. One key one is the reliability of whether the external program will be present on all systems where your software is installed. If there is a possibility that it will be missing, then maybe it is better to do it inside your program.
Weighing against that, you might consider that the extra code loaded into your program is prohibitive - you don't need the code bloat for such a seldom-used part of your application.
The system() function is convenient, but dangerous, not least because it invokes a shell, usually. You may be better off calling the program more directly - on Unix, via the fork() and exec() system calls. [Note that a system call is very different from calling the system() function, incidentally!] OTOH, you may need to worry about ensuring all open file descriptors in your program are closed - especially if your program is some sort of daemon running on behalf of other users; that is less of a problem if your are not using special privileges, but it is still a good idea not to give the invoked program access to anything you did not intend. You may need to look at the fcntl() system call and the FD_CLOEXEC flag.
Generally, it is easier to keep control of things if you build the functionality into your program, but it is not a trivial decision.
Security is one concern. A malicious cURL could cause havoc in your program. It depends if this is a personal program where coding speed is your main focus, or a commercial application where things like security play a factor.
System calls are much harder to make safely.
All sorts of funny characters need to be correctly encoded to pass arguments in, and the types of encoding may vary by platform or even version of the command. So making a system call that contains any user data at all requires a lot of sanity-checking and it's easy to make a mistake.
Yeah, as mentioned above, keep in mind the difference between system calls (like fcntl() and open()) and system() calls. :)
In the early stages of prototyping a c program, I often make external calls to programs like grep and sed for manipulation of files using popen(). It's not safe, it's not secure, and it's certainly not portable. But it can allow you to get going quickly. That's valuable to me. It lets me focus on the really important core of the program, usually the reason I used c in the first place.
In high level languages, you'd better have a pretty good reason. :)
Instead of doing either, I'd Unix it up and build a script framework around your app, using the command line arguments and stdin.
Other's have mentioned good points (reliability, security, safety, portability, etc) - but I'll throw out another. Performance. Generally it is many times faster to call a library function or even spawn a new thread then it is to start an entire new process (and then you still have to correctly check/verify it's execution and parse it's output!)

What's the difference between a "script" and an "application"?

I'm referring to distinctions such as in this answer:
...bash isn't for writing applications it's for, well, scripting. So sure, your application might have some housekeeping scripts but don't go writing critical-business-logic.sh because another language is probably better for stuff like that.
As programmer who's worked in many languages, this seems to be C, Java and other compiled language snobbery. I'm not looking for reenforcement of my opinion or hand-wavy answers. Rather, I genuinely want to know what technical differences are being referred to.
(And I use C in my day job, so I'm not just being defensive.)
Traditionally a program is compiled and a script is interpreted, but that is not really important anymore. You can generate a compiled version of most scripts if you really want to, and other 'compiled' languages like Java are in fact interpreted (at the byte code level.)
A more modern definition might be that a program is intended to be used by a customer (perhaps an internal one) and thus should include documentation and support, while a script is primarily intended for the use of the author.
The web is an interesting counter example. We all enjoy looking things up with the Google search engine. The bulk of the code that goes into creating the 'database' it references is used only by its authors and maintainers. Does that make it a script?
I would say that an application tends to be used interactively, where a script would run its course, suitable for batch work. I don't think it's a concrete distinction.
Usually, it is "script" versus "program".
I am with you that this distinction is mostly "compiled language snobbery", or to quote Larry Wall and take the other side of the fence, "a script is what the actors have, a programme is given to the audience".
This is an interesting topic, and I don't think there are very good guidelines for the differentiating a "script" and a "application."
Let's take a look at some Wikipedia articles to get a feel of the distinction.
Script (Wikipedia -> Scripting language):
A scripting language, script language or extension language, is a programming language that controls a software application. "Scripts" are often treated as distinct from "programs", which execute independently from any other application. At the same time they are distinct from the core code of the application, which is usually written in a different language, and by being accessible to the end user they enable the behavior of the application to be adapted to the user's needs.
Application (Wikipedia -> Application software -> Terminology)
In computer science, an application is a computer program designed to help people perform a certain type of work. An application thus differs from an operating system (which runs a computer), a utility (which performs maintenance or general-purpose chores), and a programming language (with which computer programs are created). Depending on the work for which it was designed, an application can manipulate text, numbers, graphics, or a combination of these elements.
Reading the above entries seems to suggest that the distinction is that a script is "hosted" by another piece of software, while an application is not. I suppose that can be argued, such as shell scripts controlling the behavior of the shell, and perl scripts controlling the behavior of the interpreter to perform desired operations. (I feel this may be a little bit of a stretch, so I may not completely agree with it.)
When it comes down to it, it is in my opinion that the colloquial distinction can be made in terms of the scale of the program. Scripts are generally smaller in scale when compared to applications.
Also, in terms of the purpose, a script generally performs tasks that needs taken care of, say for example, build scripts that produce multiple release versions for a certain piece of software. On the otherhand, applications are geared toward providing functionality that is more refined and geared toward an end user. For example, Notepad or Firefox.
John Ousterhout (the inventor of TCL) has a good article at http://www.tcl.tk/doc/scripting.html where he proposes a distinction between system programming languages (for implementing building blocks, emphasis on correctness, type safety) vs scripting languages (for combining building blocks, emphasis on responsiveness to changing environments and requirements, easy conversion in and out of textual representations). If you go with that categorisation system, then 99% of programmers are doing jobs that are more appropriate to scripting languages than to system programming languages.
A script tends to be a series of commands that starts, runs, and terminates. It often requires no/little human interaction. An application is a "program"... it often requires human interaction, it tends to be larger.
Script to me implies line-by-line interpretation of the code. You can open a script and view its programmer-readable contents. An application implies a stand-alone compiled executable.
It's often just a semantic argument, or even a way of denigrating certain programming languages. As far as I'm concerned, a "script" is a type of program, and the exact definition is somewhat vague and varies with context.
I might use the term "script" to mean a program that primarily executes linearly, rather than with lots of sequential logic or subroutines, much like a "script" in Hollywood is a linear sequence of instructions for an actor to execute. I might use it to mean a program that is written in a language embedded inside a larger program, for the purpose of driving that program. For example, automating tasks under the old Mac OS with AppleScript, or driving a program that exposes itself in some way with an embedded TCL interface.
But in all those cases, a script is a type of program.
The term "scripting language" has been used for dynamically interpreted (sometimes compiled) languages, usually these have a lot of common features such as very high level instructions, built in hashes and arbitrary-length lists and other high level data structures, etc. But those languages are capable of very large, complicated, modular, well-designed programs, so if you think of a "script" as something other than a program, that term might confuse you.
See also Is it a Perl program or a Perl script? in perlfaq1.
A script generally runs as part of a larger application inside a scripting engine
eg. JavaScript -> Browser
This is in contrast to both traditional static typed compiled languages and to dynamic languages, where the code is intended to form the main part of the application.
An application is a collection of scripts geared toward a common set of problems.
A script is a bit of code for performing one fairly specific task.
IMO, the difference has nothing whatsoever to do with the language that's used. It's possible to write a complex application with bash, and it's possible to write a simple script with C++.
Personally, I think the separation is a step back from the actual implementation.
In my estimation, an application is planned. It has multiple goals, it has multiple deliverables. There are tasks set aside at design time in advance of coding that the application must meet.
A script however, is just thrown together as suits, and little planning is involved.
Lack of proper planning does not however downgrade you to a script. Possibly, it makes your application a poorly organized collection of poorly planned scripts.
Further more, an application can contain scripts that aggregated comprise the whole. But a script can only reference an application.
Taking perl as an example, you can write perl scripts or perl applications.
A script would imply a single file or a single namespace. (e.g. updateFile.pl).
An application would be something made up of a collection of files or namespaces/classes (e.g. an OO-designed perl application with many .pm module files).
An application is big and will be used over and over by people and maybe sold to a customer.
A script starts out small, stays small if you're lucky, is rarely sold to a customer, and might either be run automatically or fall into disuse.
What about:
Script:
A script is text file (or collection of text files) of programming statements written in a language which allows individual statements written in it to be interpreted to machine executable code directly before each is executed and with the intention of this occurring.
Application:
An application is any computer program whose primary functionality involves providing service to a human Actor.
A script-based program written in a scripting language can therefore, theoretically, have its textual statements altered while the script is being executed (at great risk of , of course). The analogous situation for compiled programs is flipping bits in memory.
Any takers? :)
First of all, I would like to make it crystal clear that a script is a program. In other words, a script is a set of instructions.
Program:
A set of instructions which is going to be compiled is known as a Program.
Script:
A set of instructions which is going to be interpreted is known as a Script.
#Jeff's answer is good. My favorite explanation is
Many (most?) scripting languages are interpreted, and few compiled
languages are considered to be scripting languages, but the question
of compiled vs. interpreted is only loosely connected to the question
of "scripting" vs. "serious" languages.
A lot of the problem here is that "scripting" is a pretty vague
designation -- it means a language that's convenient for writing
scripts in, as opposed to writing "full-blown programs" (or
applications). But how does one distinguish a complex script from a
simple application? That's an essentially unanswerable question.
Generally, a script is a series of commands applied to some set of
data, possibly in a user-defined order... but then, one could stretch
that description to apply to Photoshop, which is clearly a major
application. Scripts are generally smaller than applications, do
some well-defined thing and are "simpler" to use, and typically can
be decomposed into a clear series of sub-operations, but all of these
things are subjective.
Referenced from here.
I think that there is no matter at all whether code is compiled or interpreted.
The true difference is in core logic of code:
If code makes new functionality that is not implemented in other programs in system - it's a program. It even can be manipulated by a script.
If code is MAINLY manipulates by actions of other programs and total result is MAINLY the results of work of manipulated programs - it's a script. Literally a script of actions for some programs.
Actually the difference between a script ( or a scripting language) and an application is that a script don't require it to be compiled into machine language.. You run the source of the script with an interpreter.. A application compiles the source into machine code so that you can run it as a stand alone application.
I would say a script is usually a set of commands or instructions written in plain text that are executed by a hosting application (browser, command interpreter or shell,...).
It does not mean it's not powerfull or not compiled in some way when it's actually executed. But a script cannot do anything by itself, it's just plain text.
By nature it can be a fragment only, needing to be combined to build a program or an application, but extended and fully developed scripts or set of scripts can be considered programs or applications when executed by the host, just like a bunch of source files can become an application once compiled.
A scripting language doesn't have a standard library or platform (or not much of one). It's small and light, designed to be embedded into a larger application. Bash and Javascript are great examples of scripting languages because they rely absolutely on other programs for their functionality.
Using this definition, a script is code designed to drive a larger application (suite). A Javascript might call on Firefox to open windows or manipulate the DOM. A Bash script executes existing programs or other scripts and connects them together with pipes.
You also ask why not scripting languages, so:
Are there even any unit-testing tools for scripting languages? That seems a very important tool for "real" applications that is completely missing. And there's rarely any real library bindings for scripting languages.
Most of the times, scripts could be replaced with a real, light language like Python or Ruby anyway.