What is quine meant for? - language-agnostic

So I just came across the term Quine on Wikipedia and cannot figure out what the heck it is meant for. I'm more than confused about it. Are there any real-world uses for it?

Quine is essentially a command that outputs its own source code. And no there really aren't any practical uses for it.

A quine is useful in the following scenarios:
Programmatic compilation
Object code may be serialized to disk directly, though it has a cookie and version prepended to the front. But when compiling Scheme at run time, you want a Scheme value: for example, a compiled procedure. For this reason, so as not to break the abstraction, Guile defines a fake language at the bottom of the tower:
Value
Compiling to value loads the object code into a procedure, and wakes the sleeping giant.
Perhaps this strangeness can be explained by example: compile-file defaults to compiling to object code, because it produces object code that has to live in the barren world outside the Guile runtime; but compile defaults to compiling to value, as its product re-enters the Guile world.
Indeed, the process of compilation can circulate through these different worlds indefinitely, as shown by the following quine:
((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
Security checking:
In the Omniture case an attacker needs to add the quine in the cookie and then force the eval using a new value for the cid
Using a crafted gzip file (the attached file is a quine that unpacks to itself), one can get tar(1) to invoke an infinite chain of gzip compressors until all the memory on the machine running tar(1) has been exhausted or another resource limit kicks in.
Self Transformation
Essentially, its intuitive (and effective) content is that a program may use its own source as a variable, i.e. adding to a programming language the ability for a program to manipulate itself (its source code) does not add to its expressive power. So there exists a program that compresses its own listing; there exists one which prints its own MD5 checksum (this is much easier than finding a program — indeed any file — that contains its MD5 checksum
Self-Contained Two-Tier Architecture
If your data and the database code are not stored in the same place, you risk losing track of one, making the other useless.
Self-Preservation
TiddlyWiki is an unusual example of a practical quine: it is this ability to produce a copy of its own source code that lies at the heart of TiddlyWiki's ability to independently save changes to itself.
Proofs
If "quine" means "an automaton using its own source code as input", then Gödel used a quine, more or less, to prove his incompleteness theorem, Turing used one to prove that the halting problem was uncomputable, Thompson used one to show that access to the source code of all of your software and compilers is insufficient to find backdoors in it, Steve Russell approximately invented functional programming languages by encoding McCarthy's Lisp quine into (709?) assembly, John von Neumann predicted that the structure of self-reproducing life forms in general would turn out to be quines, and was proved right with the discovery of DNA.
So if quines are responsible for the existence of life, Gödel's incompleteness theorem, the proof of the uncomputability of the halting problem, and the existence of functional programming languages, I nominate them as the most historically important category of programs of all.
Unstructured loops
Muriel has no traditional control structures. Instead, Muriel has a command to replace the currently running Muriel program with a given string, and run that instead. This leads to a programming method where a program must quine itself in order to perform any sort of loop.
References
Guile Reference Manual: Compiler Tower
Minded Security Blog: God Save The (Omniture) Quine
Irresistible programs | Lambda the Ultimate: Quines are the most historically important programs of all!
TiddlyWiki: Quine
libarchive Issue #660: Possible denial of service using a crafted gzip file
QuineDB
Quines (self-replicating programs)
Quine - Rosetta Code
Comparison of Two Population Proportions | R Tutorial
Muriel - Esolang
A quine in pure lambda calculus - Computer Science Stack Exchange

No, it's not an useful thing, it's just an exercise of style, that some programmers enjoy...

Related

What features of interpreted languages can a compiled one not have?

Interpreted languages are usually more high-level and therefore have features as dynamic typing (including creating new variables dynamically without declaration), the infamous eval and many many other features that make a programmer's life easier - but why can't compiled languages have these as well?
I don't mean languages like Java that run on a VM, but those that compile to binary like C(++).
I'm not going to make a list now but if you are going to ask which features I mean, please look into what PHP, Python, Ruby etc. have to offer.
Which common features of interpreted languages can't/don't/do exist in compiled languages? Why?
Whether source code is compiled - to native binaries, some kind of intermediate language (Java Bytecode/IL) - or interpreted is absolutely no trait of the language. It's just a question of the implementation.
You can actually have both compilers and interpreters for the same language like
Haskell: GHC <-> GHCI
C: gcc <-> ch
VB6: VS IDE <-> VB6 compiler
Certain language features like eval or dynamic typing may suggest a distinction between so called "dynamic languages" and static ones, but how this is run can never be the primary question.
Initially, one of the largest benefits of interpreted languages was debugging. That way you can get incredibly accurate and detailed information when looking for the reason a program isn't working. However, most compilers have become advanced enough that that is not too big of a deal any more.
The other main benefit (in my opinion anyway), is that with interpreted languages, you don't have to wait for eternity for your project to compile to test it out.
You couldn't plausibly do eval, for example, for reasons I'd have thought were pretty obvious: exactly how would you implement it? Make the runtime contain a full copy of the compiler? Every time you wanted to evaluate a string (keeping in mind that each time it could be different!) you'd save the string to a file, run the compiler on it to make a DLL/shared-lib, then load that DLL/shared-lib and call your code? You can't see why this might be a wee bit impractical? ;)
You can find this kind of thing in dynamic languages all over the place that you can't do with static code short of basically running an interpreter, in effect, behind the scenes.
Continuing on from Dario - I think you are really asking why a compiled program can't evaluate statements at runtime (e.g. eval). Here's some reasons I can think of:
The full compiler would have to be distributed with the program (or be part of the program)
For an eval function to have access to type information and symbols (such as variable names and function names) in the environment it was used the original program would have to be compiled with those symbols accessible (compiled languages usually remove these symbols at compile time).
Edit: As noted neither of these reasons make it impossible for a language/compiler to be able to evaluate code at runtime, but they are definitely things that need to be taken into consideration when developing a compiler or when designing a language.
Maybe the question is not about interpreted/compiled languages (compile is ambiguous anyway) but about languages that do/don't carry their own compiler around with them? For instance we've said C++ could do eval with a handy compiler floating around in the app, and reflection presumably is similar in some ways.

What is instrumentation?

I've heard this term used a lot in the same context as logging, but I can't seem to find a clear definition of what it actually is.
Is it simply a more general class of logging/monitoring tools and activities?
Please provide sample code/scenarios when/how instrumentation should be used.
I write tools that perform instrumentation. So here is what I think it is.
DLL rewriting. This is what tools like Purify and Quantify do. A previous reply to this question said that they instrument post-compile/link. That is not correct. Purify and Quantify instrument the DLL the first time it is executed after a compile/link cycle, then cache the result so that it can be used more quickly next time around. For large applications, profiling the DLLs can be very time consuming. It is also problematic - at a company I worked at between 1998-2000 we had a large 2 million line app that would take 4 hours to instrument, and 2 of the DLLs would randomly crash during instrumentation and if either failed you would have do delete both of them, then start over.
In place instrumentation. This is similar to DLL rewriting, except that the DLL is not modified and the image on the disk remains untouched. The DLL functions are hooked appropriately to the task required when the DLL is first loaded (either during startup or after a call to LoadLibrary(Ex). You can see techniques similar to this in the Microsoft Detours library.
On-the-fly instrumentation. Similar to in-place but only actually instruments a method the first time the method is executed. This is more complex than in-place and delays the instrumentation penalty until the first time the method is encountered. Depending on what you are doing, that could be a good thing or a bad thing.
Intermediate language instrumentation. This is what is often done with Java and .Net languages (C~, VB.Net, F#, etc). The language is compiled to an intermediate language which is then executed by a virtual machine. The virtual machine provides an interface (JVMTI for Java, ICorProfiler(2) for .Net) which allows you to monitor what the virtual machine is doing. Some of these options allow you to modify the intermediate language just before it gets compiled to executable instructions.
Intermediate language instrumentation via reflection. Java and .Net both provide reflection APIs that allow the discovery of metadata about methods. Using this data you can create new methods on the fly and instrument existing methods just as with the previously mentioned Intermediate language instrumentation.
Compile time instrumentation. This technique is used at compile time to insert appropriate instructions into the application during compilation. Not often used, a profiling feature of Visual Studio provides this feature. Requires a full rebuild and link.
Source code instrumentation. This technique is used to modify source code to insert appropriate code (usually conditionally compiled so you can turn it off).
Link time instrumentation. This technique is only really useful for replacing the default memory allocators with tracing allocators. An early example of this was the Sentinel memory leak detector on Solaris/HP in the early 1990s.
The various in-place and on-the-fly instrumentation methods are fraught with danger as it is very hard to stop all threads safely and modify the code without running the risk of requiring an API call that may want to access a lock which is held by a thread you've just paused - you don't want to do that, you'll get a deadlock. You also have to check if any of the other threads are executing that method, because if they are you can't modify it.
The virtual machine based instrumentation methods are much easier to use as the virtual machine guarantees that you can safely modify the code at that point.
(EDIT - this item added later) IAT hooking instrumentation. This involved modifying the import addess table for functions linked against in other DLLs/Shared Libraries. This type of instrumentation is probably the simplest method to get working, you do not need to know how to disassemble and modify existing binaries, or do the same with virtual machine opcodes. You just patch the import table with your own function address and call the real function from your hook. Used in many commercial and open source tools.
I think I've covered them all, hope that helps.
instrumentation is usually used in dynamic code analysis.
it differs from logging as instrumentation is usually done automatically by software, while logging needs human intelligence to insert the logging code.
It's a general term for doing something to your code necessary for some further analysis.
Especially for languages like C or C++, there are tools like Purify or Quantify that profile memory usage, performance statistics, and the like. To make those profiling programs work correctly, an "instrumenting" step is necessary to insert the counters, array-boundary checks, etc that is used by the profiling programs. Note that in the Purify/Quantify scenario, the instrumentation is done automatically as a post-compilation step (actually, it's an added step to the linking process) and you don't touch your source code.
Some of that is less necessary with dynamic or VM code (i.e. profiling tools like OptimizeIt are available for Java that does a lot of what Quantify does, but no special linking is required) but that doesn't negate the concept.
A excerpt from wikipedia article
In context of computer programming,instrumentation refers to an
ability to monitor or measure the level of a product's performance, to
diagnose errors and to write trace information. Programmers implement
instrumentation in the form of code instructions that monitor specific
components in a system (for example, instructions may output logging
information to appear on screen). When an application contains
instrumentation code, it can be managed using a management tool.
Instrumentation is necessary to review the performance of the
application. Instrumentation approaches can be of two types, source
instrumentation and binary instrumentation.
Whatever Wikipedia says, there is no standard / widely agreed definition for code instrumentation in IT industry.
Please consider, instrumentation is a noun derived from instrument which has very broad meaning.
"Code" is also everything in IT, I mean - data, services, everything.
Hence, code instrumentation is a set of applications that is so wide ... not worth giving it a separate name ;-).
That's probably why this Wikipedia article is only a stub.

What language features are required in a programming language to make a compiler?

Programming languages seem to go through several stages. Firstly, someone dreams up a new language, Foo Language. The compiler/interpreter is written in another language, usually C or some other low level language. At some point, FooL matures and grows, and eventually someone, somewhere will write a compiler and/or interpreter for FooL in FooL itself.
My question is this: What is the minimal subset of language features such that someone could implement that language in itself?
Compiler can be written even using a Turing machine - a Universal Turing Machine is basically a compiler/interpreter of any Turing machine, so any Turing-complete language should be enough :)
In theory, surprisingly little. A computability theorist would say that all you need is mu-recursion or a Turing machine or the like.
However, from a practical point of view, you're not going to be very happy trying to implement a programming language in a Turing machine. I would say that, at a minimum, you would want to have all the usual control-flow constructs, the primitive datatypes, subroutines, as well as arrays and structs. That should be enough to let you implement that subset of the language in the language itself -- and you can then bootstrap yourself up from there.
One option is a read-eval-print loop. This can be used to build many higher-level constructs. I believe this is the path taken by LISP.
I am unsure about the beginnings of C, but I think it started with a few system calls to implement branching, loops, assignment and single-character I/O, and built from there.
Id assume a assembler would make the cut.
My question is this: What is the minimal subset of language features such that someone could implement that language in itself?
There is no requirement for the language to be useful for anything other than compiling itself? I present to you Useless, the language in which every text is a proper program and means "a program that takes any input and produces itself" (this is also known as Useless compiler).

runnable pseudocode?

I am attempting to determine prior art for the following idea:
1) user types in some code in a language called (insert_name_here);
2) user chooses a destination language from a list of well-known output candidates (javascript, ruby, perl, python);
3) the processor translates insert_name_here into runnable code in destination language;
4) the processor then runs the code using the relevant system call based on the chosen language
The reason this works is because there is a pre-established 1 to 1 mapping between all language constructs from insert_name_here to all supported destination languages.
(Disclaimer: This obviously does not produce "elegant" code that is well-tailored to the destination language. It simply does a rudimentary translation that is runnable. The purpose is to allow developers to get a quick-and-dirty implementation of algorithms in several different languages for those cases where they do not feel like re-inventing the wheel, but are required for whatever reason to work with a specific language on a specific project.)
Does this already exist?
The .NET CLR is designed such that C++.Net, C#.Net, and VB.Net all compile to the same machine language, and you can "decompile" that CLI back in to any one of those languages.
So yes, I would say it already exists though not exactly as you describe.
There are converters available for different languages. The problem you are going to have is dealing with libraries. While mapping between language statements might be easy, finding mappings between library functions will be very difficult.
I'm not really sure how useful that type of code generator would be. Why would you want to write something in one language and then immediately convert it to something else? I can see the rationale for 4th Gen languages that convert diagrams or models into code but I don't really see the point of your effort.
Yes, a program that transform a program from one representation to another does exist. It's called a "compiler".
And as to your question whether that is always possible: as long as your target language is at least as powerful as the source language, then it is possible. So, if your target language is Turing-complete, then it is always possible, because there can be no language that is more powerful than a Turing-complete language.
However, there does not need to be a dumb 1:1 mapping.
For example: the Microsoft Volta compiler which compiles CIL bytecode to JavaScript sourcecode has a problem: .NET has threads, JavaScript doesn't. But you can implement threads with continuations. Well, JavaScript doesn't have continuations either, but you can implement continuations with exceptions. So, Volta transforms the CIL to CPS and then implements CPS with exceptions. (Newer versions of JavaScript have semi-coroutines in the form of generators; those could also be used, but Volta is intended to work across a wide range of JavaScript versions, including obviously JScript in Internet Explorer.)
This seems a little bizarre. If you're using the term "prior art" in its most common form, you're discussing a potentially patentable idea. If that is the case, you have:
1/ Published the idea, starting the clock running on patent filing - I'm assuming, perhaps incorrectly, that you're based in the U.S. Other jurisdictions may have other rules.
2/ Told the entire planet your idea, which means it's pretty much useless to try and patent it, unless you act very fast.
If you're not thinking about patenting this and were just using the term "prior art" in a laypersons sense, I apologize. I work for a company that takes patents very seriously and it's drilled into us, in great detail, what we're allowed to do with information before filing.
Having said that, patentable ideas must be novel, useful and non-obvious. I would think that your idea would not pass on the third of these since you're describing a language translator which would have the prior art of the many pascal-to-c and fortran-to-c converters out there.
The one glimmer of hope would be the ability of your idea to generate one of multiple output languages (which p2c and f2c don't do) but I think even that would be covered by the likes of cross compilers (such as gcc) which turn source into one of many different object languages.
IBM has a product called Visual Age Generator in which you code in one (proprietary) language and it's converted into COBOL/C/Java/others to run on different target platforms from PCs to the big honkin' System z mainframes, so there's your first problem (thinking about patenting an idea that IBM, the biggest patenter in the world, is already using).
Tons of them. p2c, f2c, and the original implementation s of C++ and Objective C strike me immediately. Beyond that, it's kind of hard to distinguish what you're describing from any compiler, especially for us old guys whose compilers generated ASM code for an intermediate represetation anyway.

What's the difference between a "script" and an "application"?

I'm referring to distinctions such as in this answer:
...bash isn't for writing applications it's for, well, scripting. So sure, your application might have some housekeeping scripts but don't go writing critical-business-logic.sh because another language is probably better for stuff like that.
As programmer who's worked in many languages, this seems to be C, Java and other compiled language snobbery. I'm not looking for reenforcement of my opinion or hand-wavy answers. Rather, I genuinely want to know what technical differences are being referred to.
(And I use C in my day job, so I'm not just being defensive.)
Traditionally a program is compiled and a script is interpreted, but that is not really important anymore. You can generate a compiled version of most scripts if you really want to, and other 'compiled' languages like Java are in fact interpreted (at the byte code level.)
A more modern definition might be that a program is intended to be used by a customer (perhaps an internal one) and thus should include documentation and support, while a script is primarily intended for the use of the author.
The web is an interesting counter example. We all enjoy looking things up with the Google search engine. The bulk of the code that goes into creating the 'database' it references is used only by its authors and maintainers. Does that make it a script?
I would say that an application tends to be used interactively, where a script would run its course, suitable for batch work. I don't think it's a concrete distinction.
Usually, it is "script" versus "program".
I am with you that this distinction is mostly "compiled language snobbery", or to quote Larry Wall and take the other side of the fence, "a script is what the actors have, a programme is given to the audience".
This is an interesting topic, and I don't think there are very good guidelines for the differentiating a "script" and a "application."
Let's take a look at some Wikipedia articles to get a feel of the distinction.
Script (Wikipedia -> Scripting language):
A scripting language, script language or extension language, is a programming language that controls a software application. "Scripts" are often treated as distinct from "programs", which execute independently from any other application. At the same time they are distinct from the core code of the application, which is usually written in a different language, and by being accessible to the end user they enable the behavior of the application to be adapted to the user's needs.
Application (Wikipedia -> Application software -> Terminology)
In computer science, an application is a computer program designed to help people perform a certain type of work. An application thus differs from an operating system (which runs a computer), a utility (which performs maintenance or general-purpose chores), and a programming language (with which computer programs are created). Depending on the work for which it was designed, an application can manipulate text, numbers, graphics, or a combination of these elements.
Reading the above entries seems to suggest that the distinction is that a script is "hosted" by another piece of software, while an application is not. I suppose that can be argued, such as shell scripts controlling the behavior of the shell, and perl scripts controlling the behavior of the interpreter to perform desired operations. (I feel this may be a little bit of a stretch, so I may not completely agree with it.)
When it comes down to it, it is in my opinion that the colloquial distinction can be made in terms of the scale of the program. Scripts are generally smaller in scale when compared to applications.
Also, in terms of the purpose, a script generally performs tasks that needs taken care of, say for example, build scripts that produce multiple release versions for a certain piece of software. On the otherhand, applications are geared toward providing functionality that is more refined and geared toward an end user. For example, Notepad or Firefox.
John Ousterhout (the inventor of TCL) has a good article at http://www.tcl.tk/doc/scripting.html where he proposes a distinction between system programming languages (for implementing building blocks, emphasis on correctness, type safety) vs scripting languages (for combining building blocks, emphasis on responsiveness to changing environments and requirements, easy conversion in and out of textual representations). If you go with that categorisation system, then 99% of programmers are doing jobs that are more appropriate to scripting languages than to system programming languages.
A script tends to be a series of commands that starts, runs, and terminates. It often requires no/little human interaction. An application is a "program"... it often requires human interaction, it tends to be larger.
Script to me implies line-by-line interpretation of the code. You can open a script and view its programmer-readable contents. An application implies a stand-alone compiled executable.
It's often just a semantic argument, or even a way of denigrating certain programming languages. As far as I'm concerned, a "script" is a type of program, and the exact definition is somewhat vague and varies with context.
I might use the term "script" to mean a program that primarily executes linearly, rather than with lots of sequential logic or subroutines, much like a "script" in Hollywood is a linear sequence of instructions for an actor to execute. I might use it to mean a program that is written in a language embedded inside a larger program, for the purpose of driving that program. For example, automating tasks under the old Mac OS with AppleScript, or driving a program that exposes itself in some way with an embedded TCL interface.
But in all those cases, a script is a type of program.
The term "scripting language" has been used for dynamically interpreted (sometimes compiled) languages, usually these have a lot of common features such as very high level instructions, built in hashes and arbitrary-length lists and other high level data structures, etc. But those languages are capable of very large, complicated, modular, well-designed programs, so if you think of a "script" as something other than a program, that term might confuse you.
See also Is it a Perl program or a Perl script? in perlfaq1.
A script generally runs as part of a larger application inside a scripting engine
eg. JavaScript -> Browser
This is in contrast to both traditional static typed compiled languages and to dynamic languages, where the code is intended to form the main part of the application.
An application is a collection of scripts geared toward a common set of problems.
A script is a bit of code for performing one fairly specific task.
IMO, the difference has nothing whatsoever to do with the language that's used. It's possible to write a complex application with bash, and it's possible to write a simple script with C++.
Personally, I think the separation is a step back from the actual implementation.
In my estimation, an application is planned. It has multiple goals, it has multiple deliverables. There are tasks set aside at design time in advance of coding that the application must meet.
A script however, is just thrown together as suits, and little planning is involved.
Lack of proper planning does not however downgrade you to a script. Possibly, it makes your application a poorly organized collection of poorly planned scripts.
Further more, an application can contain scripts that aggregated comprise the whole. But a script can only reference an application.
Taking perl as an example, you can write perl scripts or perl applications.
A script would imply a single file or a single namespace. (e.g. updateFile.pl).
An application would be something made up of a collection of files or namespaces/classes (e.g. an OO-designed perl application with many .pm module files).
An application is big and will be used over and over by people and maybe sold to a customer.
A script starts out small, stays small if you're lucky, is rarely sold to a customer, and might either be run automatically or fall into disuse.
What about:
Script:
A script is text file (or collection of text files) of programming statements written in a language which allows individual statements written in it to be interpreted to machine executable code directly before each is executed and with the intention of this occurring.
Application:
An application is any computer program whose primary functionality involves providing service to a human Actor.
A script-based program written in a scripting language can therefore, theoretically, have its textual statements altered while the script is being executed (at great risk of , of course). The analogous situation for compiled programs is flipping bits in memory.
Any takers? :)
First of all, I would like to make it crystal clear that a script is a program. In other words, a script is a set of instructions.
Program:
A set of instructions which is going to be compiled is known as a Program.
Script:
A set of instructions which is going to be interpreted is known as a Script.
#Jeff's answer is good. My favorite explanation is
Many (most?) scripting languages are interpreted, and few compiled
languages are considered to be scripting languages, but the question
of compiled vs. interpreted is only loosely connected to the question
of "scripting" vs. "serious" languages.
A lot of the problem here is that "scripting" is a pretty vague
designation -- it means a language that's convenient for writing
scripts in, as opposed to writing "full-blown programs" (or
applications). But how does one distinguish a complex script from a
simple application? That's an essentially unanswerable question.
Generally, a script is a series of commands applied to some set of
data, possibly in a user-defined order... but then, one could stretch
that description to apply to Photoshop, which is clearly a major
application. Scripts are generally smaller than applications, do
some well-defined thing and are "simpler" to use, and typically can
be decomposed into a clear series of sub-operations, but all of these
things are subjective.
Referenced from here.
I think that there is no matter at all whether code is compiled or interpreted.
The true difference is in core logic of code:
If code makes new functionality that is not implemented in other programs in system - it's a program. It even can be manipulated by a script.
If code is MAINLY manipulates by actions of other programs and total result is MAINLY the results of work of manipulated programs - it's a script. Literally a script of actions for some programs.
Actually the difference between a script ( or a scripting language) and an application is that a script don't require it to be compiled into machine language.. You run the source of the script with an interpreter.. A application compiles the source into machine code so that you can run it as a stand alone application.
I would say a script is usually a set of commands or instructions written in plain text that are executed by a hosting application (browser, command interpreter or shell,...).
It does not mean it's not powerfull or not compiled in some way when it's actually executed. But a script cannot do anything by itself, it's just plain text.
By nature it can be a fragment only, needing to be combined to build a program or an application, but extended and fully developed scripts or set of scripts can be considered programs or applications when executed by the host, just like a bunch of source files can become an application once compiled.
A scripting language doesn't have a standard library or platform (or not much of one). It's small and light, designed to be embedded into a larger application. Bash and Javascript are great examples of scripting languages because they rely absolutely on other programs for their functionality.
Using this definition, a script is code designed to drive a larger application (suite). A Javascript might call on Firefox to open windows or manipulate the DOM. A Bash script executes existing programs or other scripts and connects them together with pipes.
You also ask why not scripting languages, so:
Are there even any unit-testing tools for scripting languages? That seems a very important tool for "real" applications that is completely missing. And there's rarely any real library bindings for scripting languages.
Most of the times, scripts could be replaced with a real, light language like Python or Ruby anyway.