Simulator or Emulator? What is the difference? - terminology

While I understand what simulation and emulation mean in general, I almost always get confused about them. Assume that I create a piece of software that mimics existing hardware/software, what should I call it? A simulator or an emulator?
Could anyone explain the difference in terms of programming?
Bonus: What is the difference in English between these two terms? (Sorry, I am not a native speaker :))

Emulation is the process of mimicking the outwardly observable behavior to match an existing target. The internal state of the emulation mechanism does not have to accurately reflect the internal state of the target which it is emulating.
Simulation, on the other hand, involves modeling the underlying state of the target. The end result of a good simulation is that the simulation model will emulate the target which it is simulating.
Ideally, you should be able to look into the simulation and observe properties that you would also see if you looked into the original target. In practice, there may be some shortcuts to the simulation for performance reasons -- that is, some internal aspects of the simulation may actually be an emulation.
MAME is an arcade game emulator; Hyperterm is a (not very good) terminal emulator. There's no need to model the arcade machine or a terminal in detail to get the desired emulated behavior.
Flight Simulator is a simulator; SPICE is an electronics simulator. They model as much as possible every detail of the target to represent what the target does in reality.
EDIT: Other responses have pointed out that the goal of an emulation is to be able to substitute for the object it is emulating. That's an important point. A simulation's focus is more on the modeling of the internal state of the target -- and the simulation does not necessarily lead to emulation. In particular, a simulation may run far slower than real-time. SPICE, for example, cannot substitute for an actual electronic circuit (even assuming there were some kind of magical device that perfectly interfaced electrical circuits to a SPICE simulation).
In short, a simulation does not always lead to emulation.
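To make that concrete, here is a toy sketch (an 8-bit adder I made up for illustration, not something from the question): the emulator only reproduces the observable output, any way it likes, while the simulator models internals you could also probe on the real part.

    # Hypothetical example: an 8-bit adder "chip".
    def emulated_add(a, b):
        """Emulation: reproduce the observable output any convenient way.
        The host's integer arithmetic stands in for the chip; no internal detail."""
        return (a + b) & 0xFF

    def simulated_add(a, b):
        """Simulation: model the ripple-carry internals bit by bit, so you can
        'probe' the carry chain just as you could on the real silicon."""
        carry, result, carries = 0, 0, []
        for i in range(8):
            x, y = (a >> i) & 1, (b >> i) & 1
            result |= (x ^ y ^ carry) << i
            carry = (x & y) | (carry & (x ^ y))
            carries.append(carry)          # internal state you can observe
        return result, carries

    assert emulated_add(200, 100) == simulated_add(200, 100)[0] == 44

Both give the same answers from the outside; only the simulator lets you look at the carries.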

If a flight-simulator could transport you from A to B then it would be a flight-emulator.
An emulator can replace the original for real use.
A Virtual PC emulates a PC.
A simulator is a model for study and analysis.
An emulator will always have to operate close to real-time. For a simulator that is not always the case. A geological simulation could do 1000 years/second or more.

Simulation = For analysis and study
Emulation = For usage as a substitute
A simulator is an environment which models something, while an emulator is one that replicates the usage as on the original device or system.
A simulator mimics the activity of something that it is simulating. It "appears" (a lot can go with this "appears", depending on the context) to be the same as the thing being simulated. For example, the flight simulator "appears" to be a real flight to the user, although it does not transport you from one place to another.
An emulator, on the other hand, actually "does" what the thing being emulated does, and in doing so it too "appears to be doing the same thing". An emulator may use a different set of protocols for mimicking the thing being emulated, but the result/outcome is always the same as with the original object. For example, EMU8086 emulates the 8086 microprocessor on your computer, which obviously is not running on an 8086 (= different protocols), but the output it gives is what a real 8086 would give.
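The core of such an emulator is just a fetch-decode-execute loop running on the host. The sketch below uses a made-up three-instruction machine rather than real 8086 opcodes, but the shape is the same:

    # Toy instruction set (invented for illustration; not real 8086 encodings).
    def run(program):
        regs, pc = {"A": 0, "B": 0}, 0
        while pc < len(program):
            op, arg = program[pc]
            if op == "LOAD_A":            # A <- immediate value
                regs["A"] = arg
            elif op == "MOV_B_A":         # B <- A
                regs["B"] = regs["A"]
            elif op == "ADD_A_B":         # A <- A + B (16-bit wrap-around)
                regs["A"] = (regs["A"] + regs["B"]) & 0xFFFF
            elif op == "HALT":
                break
            pc += 1
        return regs

    # The host CPU is not the guest CPU, but the observable result is what the
    # guest program would have produced.
    print(run([("LOAD_A", 5), ("MOV_B_A", None), ("LOAD_A", 7),
               ("ADD_A_B", None), ("HALT", None)]))   # {'A': 12, 'B': 5}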

It's a difference in focus. Emulators [1] focus on recreating the behavior of a system, with no regard for how the system functions internally. Simulators [2] focus on modeling the components of a system. You use an emulator when you care mostly about what a system does, and a simulator when you care about how it does it.
As for their general English meanings, emulation is "the endeavor to equal or to excel another in qualities or actions", while simulation is "to model, replicate, duplicate the behavior, appearance or properties of". Not much difference. Emulation comes from æmulus, "striving, rivaling," and is related to "imitate" and "image," which suggests a surface-level resemblance. "Simulation" comes from similis, "like", as does the word "similar," which perhaps suggests a deeper congruence.
References:
Wikipedia: Emulator
Wikipedia: Computer Simulation
Wiktionary: emulation
Wiktionary: simulation
Etymology Online: emulation
Etymology Online: simulation

I don't think emulator and simulator can be compared. Both mimic something, but they are not part of the same scope of reasoning; they are not used in the same context.
In short: an emulator is designed to copy some features of the original and can even replace it in the real environment. A simulator is not designed to copy the features of the original, but only to appear similar to the original to human beings. Without the features of the original, the simulator cannot replace it in the real environment.
An emulator is a device that mimics something closely enough that it can be substituted for the real thing. E.g. you want a circuit to work like a ROM (read only memory) circuit, but you also want to adjust the content until it is what you want. You'll use a ROM emulator, a black box (likely to be CPU-based) with physical and electrical interfaces compatible with the ROM you want to emulate. The emulator will be plugged into the device in place of the real ROM. The motherboard will not see any difference when working, but you will be able to change the emulated-ROM content easily. In other words, the emulator will act exactly like the actual thing in its motherboard context (maybe a little bit slower due to the actual internal model), but there will be additional functions (like re-writing) visible only to the designer, outside of the motherboard context. So an emulator definition would be: something that mimics the original, has all of its functional features, can actually replace it to some extent in the real world, and may have additional features not visible in the normal context.
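A rough software sketch of that ROM-emulator idea (the class names are mine): to the board it looks exactly like a ROM, i.e. it only answers reads, while the designer gets an extra interface to rewrite the contents.

    class Rom:
        def __init__(self, contents: bytes):
            self._contents = bytes(contents)

        def read(self, address: int) -> int:        # the only operation the board uses
            return self._contents[address]

    class RomEmulator(Rom):
        def load(self, new_contents: bytes):        # extra feature, invisible to the board
            self._contents = bytes(new_contents)

    def boot(rom: Rom):
        # The "motherboard" only ever calls read(); it cannot tell the two apart.
        return bytes(rom.read(a) for a in range(4))

    emulated_rom = RomEmulator(b"\x01\x02\x03\x04")
    print(boot(emulated_rom))                       # b'\x01\x02\x03\x04'
    emulated_rom.load(b"\xaa\xbb\xcc\xdd")          # the designer tweaks the contents
    print(boot(emulated_rom))                       # b'\xaa\xbb\xcc\xdd'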
A simulator is used in another thinking context, e.g. a plane simulator, a car simulator, etc. The simulation will take care of only some aspects of the actual thing, usually those related to how a human being will perceive and control it. The simulator will not perform the functions of the real stuff, and cannot be substituted for it. The plane simulator will not fly or carry someone; that's not its purpose at all. The simulator is not intended to work, but to appear to the pilot somehow like the actual thing for purposes other than its normal ones, e.g. to allow ground training (including in unusual situations like all-engine failure). So a simulator definition would be: something that can appear to a human, to some extent, like the original, but cannot replace it for actual use. In addition, the pilot will know that the simulator is a simulator.
I don't think we'll see any ROM simulator, because ROMs do not interact with human beings, nor will we see any plane emulator, because planes cannot have a replacement performing the same functions in the real world.
In my view the model inside an emulator or a simulator can be anything, and does not have to be similar to the model of the original. A ROM emulator model will likely be software instead of hardware; MS Flight Simulator cannot be more software than it is.
This comparison of both terms contradicts the currently selected answer (from Toybuilder), which puts the difference on the internal model, while my suggestion is that the difference is whether the fake can or cannot be used to perform the actual function in the actual world (to some accepted extent, indeed).
Note that the plane simulator will also have to simulate the earth, the sun, the wind, etc., which are not part of the plane. So a plane simulator will have to mimic some aspects of the plane as well as the environment of the plane, because it is not used in this actual environment, but in a training room.
This is a big difference from the emulator, which emulates only the original, and whose purpose is to be used in the environment of the original with no need to emulate it. Back to the plane context... what could a plane emulator be? Maybe a train that connects two airports -- actually two plane stops -- carrying passengers, with stewardesses onboard, with a car interior looking like an actual plane cabin, and with the captain saying "ladies and gentlemen our altitude is currently 10 kms and the temperature at our destination is 24°C". Its benefit is difficult to see, hum...
As a conclusion, the emulator is a real thing intended to work, the simulator is a fake intended to trick the user.

To understand the difference between a simulator and an emulator, keep in mind that a simulator tries to mimic the behavior of a real device. For example, in the case of the iOS Simulator, it simulates the real behavior of an actual iPhone/iPad device. However, the Simulator itself uses the various libraries installed on the Mac (such as QuickTime) to perform its rendering so that the effect looks the same as on an actual iPhone. In addition, applications tested on the Simulator are compiled into x86 code, which is what the Simulator (running on the Mac) executes natively. A real iPhone device, conversely, uses ARM-based code.
In contrast, an emulator emulates the working of a real device. Applications tested on an emulator are compiled into the actual byte-code used by the real device. The emulator executes the application by translating the byte-code into a form that can be executed by the host computer running the emulator.
To understand the subtle difference between simulation and emulation, imagine you are trying to convince a child that playing with knives is dangerous. To simulate this, you pretend to cut yourself with a knife and groan in pain. To emulate this, you actually cut yourself.

In more or less normal parlance: If your software can do everything the mimicked system can do, it's an emulator. If it only approximates the results of a system (IT or otherwise), it's a simulator.

An emulator is a model of a system which will accept any valid input that the emulated system would accept, and produce the same output or result. So your software is an emulator only if it reproduces the behavior of the emulated system precisely.
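That definition reads naturally as a test. A minimal sketch (the reference and candidate functions below are placeholders, not real systems):

    # For every valid input, the emulator's output must match the real system's.
    def real_system(x):          # stand-in for the system being emulated
        return (3 * x + 7) % 256

    def my_emulator(x):          # stand-in for the software that mimics it
        return (3 * x + 7) % 256

    valid_inputs = range(256)
    assert all(my_emulator(x) == real_system(x) for x in valid_inputs), \
        "not an emulator: outputs diverge for some valid input"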

Some years ago I came up with a very short adage that, I believe, captures the essence of the difference quite nicely:
A simulator is an emulator on a mission.
By that I mean that you use an emulator when you can't use the real thing, and you use a simulator when you can't use the real thing and you want to find something out about it.

Simple Explanation.
If you want to convert your PC (running Windows) into Mac, you can do either of these:
(1) You can simply install a Mac theme on your Windows. So, your PC feels more like Mac, but you can't actually run any Mac programs. (SIMULATION)
(or)
(2) You can program your PC to run like Mac (I'm not sure if this is possible :P ). Now you can even run Mac programs successfully and expect the same output as on Mac. (EMULATION)
In the first case, you can experience Mac, but you can't expect the same output as on Mac.
In the second case, you can expect the same output as on Mac, but still the fact remains that it is only a PC.

Simulator: it is similar to an interpreter;
i.e. it actually executes the real code line by line to mimic the behaviour.
Emulator: it is similar to an executable;
i.e. it takes the compiled code and executes it.

The distinction between the two terms is a bit fuzzy. I come from a world where "Emulators" are pieces of hardware that allow you to debug embedded systems, and I remember products that allowed you to have ICE (In Circuit Emulation) capabilities to debug a PC platform, so I find the use of the term "Emulation" to be somewhat of a misnomer for software that SIMULATES the behaviour of a piece of hardware.
My justification for the current use of the term Emulation is that it may "augment" the functionality, and is only concerned with a "reasonable" approximation of the behaviour of the system.
ICE: (In Circuit Emulation)
A piece of hardware that is plugged into a board in place of the actual processor. It allows you to run the system as if the actual processor were present. Typically these have a variant of the processor on them to actually execute the software, with glue logic to allow the user to break execution and single step under hardware control. Some would also provide logging capability. Most modern processor development systems have replaced ICE-type emulation with JTAG emulation, where the JTAG just talks to the processor via a special purpose serial link and all execution is performed by the processor mounted on the board.
Software EMULATOR:
An x86 emulator is only concerned with being able to execute x86 assembly language, not with providing an accurate cycle-per-cycle behavioural model of a SPECIFIC x86 processor. Bochs is an example of this. QEMU does this too, but also allows "virtualization" using special kernel modules.
SIMULATOR:
Texas Instruments provides a CYCLE ACCURATE behavioural model of their processors for software development that is intended to be an accurate SIMULATION of a SPECIFIC processor core's behavior for the developers to use prior to having working hardware.
Software EMULATOR augmenting functionality:
BLEEM not only allowed you to run Playstation software, but also allowed the display to be output at a higher resolution than the Playstation was able to provide, and also took advantage of more advanced capabilities of GPUs that were available (i.e. better blending and smoothing of textures).

An emulator is an alternative to the real system but a simulator is used to optimize, understand and estimate the real system.

A simulation is a system that behaves similar to something else, but is implemented in an entirely different way. It provides the basic behavior of a system but may not necessarily abide by all of the rules of the system being simulated. It is there to give you an idea about how something works.
An emulation is a system that behaves exactly like something else, and abides by all of the rules of the system being emulated. It is effectively a complete replication of another system, right down to being binary compatible with the emulated system's inputs and outputs, but operating in a different environment to the environment of the original emulated system. The rules are fixed, and cannot be changed or the system fails.

Both terms are something completely different and only intersect very little. To find the right term is actually very easy; just think about the following:
A simulation does not do anything for real. You can study it, for example how computers work, but it usually has no outcome other than that. A plane crash in a Flight Simulator causes no real harm. A weather forecast simulation itself does not change the weather.
An emulation does something for real. You can work with an emulated computer like with a physical one and create documents with it. And a plane crash in a Flight Emulator would have an outcome, like people experiencing the real impact including possible physical harm.
Your confusion probably stems from the fact that "studying the simulation" and "accessing the emulation" often are quite the same thing.
You are not alone with your confusion. The film "The Matrix" speaks of a simulation. However, The Matrix is running an emulation, as it has real impact on all members of The Matrix. In contrast, the training room has no real impact, so this is a simulation (of The Matrix).
Let's see some examples.
Simulated vs. Emulated Rain
Take a water hose in the garden and let it rain. What's the difference between simulation and emulation here?
When you are simulating rain, people will still blame you for getting them wet. Your rain has some real impact on the world, but your simulation hasn't, as the simulation does not fool anybody into thinking it is real rain.
In contrast, when you are emulating rain, people would blame the weather. This is, your emulated rain really behaves like rain in reality.
This rain emulation hence distorts reality, in making people believe in the wrong culprit. It took me quite some time to understand that. Hence it isn't easy nor obvious, which explains all the confusion.
Keep in mind that a simulation can have side effects: the weather forecast, for example, is based on simulations, which take quite some computing power and thus electrical energy, which has an environmental impact.
Hence in the example of "simulated rain", people getting wet is just a side effect and not part of the simulation. The same is true if you simulate a rainbow with this simulated rain. While the property of "how rainbows work" is part of this simulation, the simulation itself is not providing the rainbow; that just happens due to refraction of sunlight in the water drops, which are a side effect.
Simulated vs. Emulated Computer
While you might think "a simulated computer can have an outcome", this is practically wrong reasoning. If you save files onto a simulated hard drive, these files cannot leave the simulated drive outside of the simulation. You can obtain the files by studying the simulated drive, but this is not part of the simulation itself.
In case the hard drive saves the data in such a way that the data is actually usable outside of the simulation, you have an emulated hard drive within the simulation to do so.
So an emulation can be part of a simulation and vice versa.
Simulated vs. Emulated Filesystem
If you simulate a filesystem, you probably, for practicability, will choose to save the files onto your real filesystem as-is (perhaps with some additional meta-information). In that case the simulation seems to create real "value" outside of the simulation: Usable files!
But this is just by coincidence, because your simulated filesystem actually emulates a filesystem as well. You actually emulated the outside filesystem inside your simulation!
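A small sketch of that filesystem point (the class and paths are hypothetical): the "simulated" filesystem keeps its own bookkeeping, but passes the file bodies straight through to the host filesystem, which is exactly the embedded emulation described above.

    import os

    class SimulatedFs:
        def __init__(self, backing_dir):
            self.backing_dir = backing_dir
            self.metadata = {}                      # simulated-only bookkeeping
            os.makedirs(backing_dir, exist_ok=True)

        def write(self, name, data: bytes, owner="sim-user"):
            self.metadata[name] = {"owner": owner, "size": len(data)}
            # Pass-through: this part behaves like an emulated filesystem,
            # because the resulting file is real and usable outside.
            with open(os.path.join(self.backing_dir, name), "wb") as f:
                f.write(data)

    fs = SimulatedFs("/tmp/simfs")
    fs.write("hello.txt", b"hello from inside the simulation\n")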
Simulated vs. Emulated TPM or HSM
A good example of the difference is when you think of security. A TPM is a specific device to keep its own keys secure (source of identity), while an HSM is a general device to secure foreign keys (verify identity).
Fun Fact: My fingers constantly type TMP instead of TPM.
If you simulate a TPM, this has a huge effect on security, because then you can observe the internal states of the TPM, which renders all the security void. Even though such a simulation can give you valuable hints for improving the design of a TPM itself, you won't want to expose precious data to the simulated TPM for real.
However, if you emulate a TPM, you will try to hide these internal states from the outside as well as you can. Such an emulated TPM can then possibly be used to really secure something else better than without it.
With a real TPM you cannot emulate the properties of a real HSM. All you can achieve is to simulate an HSM, but this will not have the security properties of a real HSM, so all data which is stored in this simulated HSM will not be protected (it will only be protected within the simulation itself).
In contrast, with a real HSM you can emulate a TPM with all the properties of a real TPM. For this the HSM needs to be constructed such that no information needs to leave the HSM which would not leave a TPM as well.
(Please note that I do not know anything about HSMs or TPMs in particular, so it might be that there are no HSMs out there which are able to provide emulated TPMs.)
Simulated vs. Emulated World
If our world is simulated, we are simulations, too. Hence some spectator (let's call her God) can look at us and change the simulation at any time. Also, we cannot find out if we are simulated or not. As I am pretty sure that I know that I am, I do not think I am simulated, because self-awareness looks like an effect with a real component to me, which contradicts simulation. This also means our world cannot be a simulation either, as a simulation can only affect me like the world does if I am part of the simulation.
But our world still can be emulated (like in the Film "Matrix"), as all I have to "prove the world" is my state of mind and sensory input, which I cannot verify, as I cannot leave myself. If I am not part of the emulation, then there should be a chance to observe discontinuity (like in the film "Matrix"), in case the emulation does not work flawlessly.
This changes when I am emulated, too, like running an OS in an emulator. Then I cannot observe such errors, as my state can be reset from within the emulation (call it: sleep) without observable discontinuity.
However I rather think that the world is a holographic hallucination than something like an emulation. Because if it is emulated, then I am pwned by somebody (call him Rick) who is running the emulation for some purpose, while a hallucination is purely my own thing.
I stop here, because hallucinations lead us to something completely different.

This question is probably best answered by taking a look at historical practice.
In the past, I've seen gaming console emulators on PC for the PlayStation & SEGA.
Simulators are commonplace when referring to software that tries to mimic real life actions, such as driving or flying. Gran Turismo and Microsoft Flight Simulator spring to mind as classic examples of simulators.
As for the linguistic difference, emulation usually refers to the action of copying someone's (or something's) praiseworthy characteristics or behaviors. Emulation is distinct from imitation, in which a person is copied for the purpose of mockery.
The linguistic meaning of the verb 'to simulate' is essentially to pretend or mimic someone or something.

In computer science both a simulation and an emulation produce the same outputs, from the same inputs, that the original system does; however, an emulation also uses the same processes to achieve it and is made out of the same materials. A simulation uses different processes from the original system. Also worth noting is the term replication, which is the intermediate of the two -- using the same processes but being made out of a different material.
So if I want to run my old Super Mario Bros game on my PC I use an SNES emulator, because it is using the same or similar computer code (processes) to run the game, and uses the same or similar materials (silicon chip).
However, if I want to fly a Boeing 747 jet on my PC I use a flight simulator because it uses completely different processes from the original (there are no actual wings, lift or aerodynamics involved!).
Here are the exact definitions taken from a computer science glossary:
A simulation is a model of a system that captures the functional connections between inputs and outputs of the system, but without necessarily being based on processes that are the same as, or similar to, those of the system itself.
A replication is a model of a system that captures the functional connections between inputs and outputs of the system and is based on processes that are the same as, or similar to, those of the system itself.
An emulation is a model of some system that captures the functional connections between inputs and outputs of the system, based on processes that are the same as, or similar to, those of that system, and that is built of the same materials as that system.
Reference: The Open University, M366 Glossary 1.1, 2007

Both are models of an object that you have some means of controlling inputs to and observing outputs from.
The key difference is that:
With an emulator, you want the output to exactly match what the object you are emulating would produce.
With a simulator, you want certain properties of your output to be similar to what the object would produce.
Let me give an example -- suppose you want to do some system testing to see how adding a new sensor (like a thermometer) to a system would affect the system. You know that the thermometer sends a message 8 times a second containing its measurement.
Simulation -- if you do not have the thermometer yet, but you want to test that this message rate will not overload your system, you can simulate the sensor by attaching a unit that sends a random number 8 times a second. You can run any test that does not rely on the actual value the sensor sends.
Emulation -- suppose you have a very expensive thermometer that measures to 0.001 C, and you want to see if you can get by with a cheaper thermometer that only measures to the nearest 0.5 C. You can emulate the cheaper thermometer using the expensive thermometer and then rounding the reading to the nearest 0.5 C, running tests that rely on the temperature values.
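A sketch of both approaches, assuming a sensor that reports 8 readings per second (the generator functions and value ranges here are invented for illustration):

    import random
    import time

    def simulated_thermometer():
        """Simulation: only the message rate matters, so any value will do."""
        while True:
            yield round(random.uniform(15.0, 30.0), 3)
            time.sleep(1 / 8)                  # 8 messages per second

    def emulated_cheap_thermometer(expensive_readings):
        """Emulation: take the 0.001 C instrument and round to the nearest 0.5 C,
        so tests see exactly what the cheaper device would report."""
        for value in expensive_readings:
            yield round(value * 2) / 2

    print(list(emulated_cheap_thermometer([21.734, 22.112, 22.387])))
    # [21.5, 22.0, 22.5]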
Note that simulations can also be used for forecasting or predicting behavior. Finite element analysis simulations are used in many applications, including weather prediction and virtual wind tunnels.
The definitions of the terms:
emulation -- surpass or exactly match
simulate -- imitate in appearance or character

The definitions of the words describe the difference the best. A google search gives the following definitions of simulate and emulate:
simulate imitate the appearance or character of.
emulate match or surpass (a person or achievement), typically by imitation.
A simulation imitates a system. An emulation simulates a system so well that it could replace it or may even surpass it.
In computing, an emulation would be a drop-in replacement for the system it is emulating. Oftentimes it will even outperform the system it is imitating. For example, game console emulators usually make improvements such as greater hardware compatibility, better performance, and improved audio/video quality.
Simulations, on the other hand, are limited by them being models. They are a best attempt to mimic a system, but not replacements for it. There are hardware emulators because hardware can be imitated and it would be hard to tell the difference. There is no Farming Emulator because there is no emulation that could replace actual farming. We can only simulate a model of farming to gain insight on how to farm better.

A Virtual PC tries to emulate a Computer from the point of view of a Programmer BUT, at the same time, it simulates a Computer from the point of view of an Electrical Engineer.

A Simulator is something broader than an Emulator, and it seems like the duality of these terms is overthought in the posts above.
Emulator
People decided to use the new word emulation in the "computer world" when they started replacing some hardware parts of an existing system in a straightforward manner - imitating their behaviour while relying on their computational nature to be sure not to break anything and to leave everything in an equivalent state. So we have emulated this piece! (and the whole still works as before)
Emulator is usually used in a narrow sense in the digital area, as a replacement and virtualization - presented in digital form as a piece of software - of something known that existed before (virtual chips, circuit boards, electronic devices). So when the world became more digital and brought the emulator word to the masses, the masses added uncertainty to it (or additional reasons).
Simulator
First of all, I saw many comments saying that emulators do or replace something real but simulators do not.
BUT a flight simulator is used for a real thing - it trains pilots, builds their skills and knowledge, and it replaces expensive real planes and saves a lot of money. And we cannot just call it a plane-emulator, because we have an inner feeling that it is much more than that, so we call it a simulator :) A plane simulator could contain an emulated radar or transponder, that is true.
As for the counter-statements that simulators are used for analysis and study (and emulators for something real): that analysis and study is no less a real thing than emulated GSM boards (even more so in the informational age we live in). Analysis adds value to the business, cuts costs or points out profits no less than the replaced (emulated) hardware does.
A Simulator is akin to modelling something that we can't obtain for some reason (cost, technology, physical impossibility). It is usually simulated for something new or intangible or complex or not properly known to us, like a market, weather, combustion, a user. So here come the flight, black hole and stock exchange simulations.
So finally:
Simulator is broader than Emulator
A Simulator tends to imitate/model more global processes/things in general, with the ability to narrow the imitation down (e.g. a capacitor simulator with presets representing some known models)
An Emulator tends to imitate certain hardware devices with a certain specification, known characteristics and properties (e.g. an SNES emulator, Intel 8087 or Roland TB-303)
As for the words' origin, both came from Latin and mean:
emulate is "to be equal" (looks more aggressive and straightforward - rivalry)
simulate is "to be similar" (looks more sly and tricky - imitation)

Emulator:
Consider a situation where you know only English and you are in China. In order to interact with a Chinese person you need a translator. Now, the role of the translator is that it will take input from you in English, convert it to Chinese, give that input to the Chinese person, get the response from the Chinese person, convert it to English and give the output to you in English. That translator and the Chinese person, combined, are the emulator. Together they provide similar functionality as if you were communicating with an English-speaking person. So the hardware may be different, but the functionality will be the same.
Simulator:
I can't give a better example than SPICE or a flight simulator. Both replace hardware component behavior with a software or mathematical model which behaves similarly to the hardware.
In the end it depends on the context which solution better suits the project's needs.

Emulation is like
Abstraction.
It shows what it can do.
Example: Car driving emulation.
Simulation is like
Encapsulation.
It shows how it does it.
Example: Car engine inner activity.

The simulator is necessarily a scale model.
Emulators pretend to be a 1:1 model.

Related

Term for a smallish program built on top of a larger library/framework?

What exactly do you call a relatively smallish program/application built on top of a larger library/framework?
The small program (foo) in question does a whole lot, but much of the heavy lifting is done by the library (bar) on top of which this program is built. When I say that I designed/developed 'foo' with such and such capabilities, I do not want to convey the wrong idea that I coded everything, including the low level stuff, all by myself.
Edit: Just to clarify, this is a numerical code built on top of a numerical library.
TL;DR: almost nobody is going to think you invented everything. If they do, it's a good opportunity to educate them with some high-level information about computers and software architecture.
In general, this is just an application, utility, or tool. Your runtime context may throw in additional adjectives (e.g. command-line tool, web application, etc).
I think your worries regarding attribution are probably unfounded. If your project is open source, your documentation will certainly need to list the build and runtime dependencies. If there are different licenses, you'll likely also have to ship those with your tool. So it's unlikely that anyone other than people entirely unfamiliar with software engineering would get the "wrong idea".
Furthermore, nearly every software package is built on top of some sort of toolkit. For example, even basic utilities like ls, cp, etc. are built on top of the standard C library and make use of system calls provided by the operating system. Indeed, without the OS, such utilities have no runtime environment in which to execute. The OS has nothing to do if there is no hardware for it to manage (and even some of that hardware is likely to have firmware -- which is just software-on-chip -- to control some of its behavior regardless of an operating system).
The higher up the stack you move, the harder it becomes for someone to mistake the work you did versus the work you built upon. A web application needs an HTTP server, possibly a module interface or CGI environment, a language to express the intent of the software, etc. And then all of this is built on top of the OS, which goes down to hardware, some firmware, etc.
Finally, even if the library does the heavy lifting, that doesn't detract from the value of your software. If your software does a number of very useful things, it doesn't matter whether the library enabled your software to do those things. Some of the most important inventions in history are super simple in retrospect. It just took someone to see how to combine the parts in a different way. This is effectively what we do with software.
If someone does seem to get the wrong idea, this is perhaps a good time to educate them about the complexities of computing environments, the interrelationships between software components, the software stack, etc. It also might be fine to just let it slide and say, "Thank you!"

Benefits of cross-platform development?

Are there benefits to developing an application on two or more different platforms? Does using a different compiler on even the same platform have benefits?
Yes, especially if you plan to distribute your code for multiple platforms.
But even if you don't, cross-platform development is a form of future-proofing; if it runs on multiple (diverse) platforms today, it's more likely to run on future platforms than something that was tuned, tweaked, and specialized to work on a version 7.8.3 clean install of vendor X's Q-series boxes (patch level 1452) and nothing else.
There seems to be a benefit in finding and simply preventing bugs with a different compiler and a different OS. Different CPUs can pin down endian issues early. There is the pain at the GUI level if you want to stay native at that level.
Short answer: Yes.
Short of cloning a disk, it is almost impossible to make two systems exactly alike, so you are going to end up running on "different platforms" whether you meant to or not. By specifically confronting and solving the "what if system A doesn't do things like B?" problem head on you are much more likely to find those key assumptions your code makes.
That said, I would say you should get a good chunk of your base code working on system A, and then take a day (or a week or ...) and get it running on system B. It can be very educational.
My education came back in the 80's when I ported a source level C debugger to over 100 flavors of U*NX. Gack!
Are there benefits to developing an application on two or more different platforms?
If this is production software, the obvious reason is the lure of a larger client base. Your product's appeal is magnified the moment the client hears that you support multiple platforms. Remember, most enterprises do not use a single OS or even a single version of the OS. It is fairly typical to find a section using Windows, another Mac, and a smaller section some flavor of Linux.
It is also often the case that customizing a product for a single platform is far more tedious than having it run on multiple platforms. The law of diminishing returns kicks in even before you know it.
Of course, all of this makes little sense, if you are doing customization work for an existing product for the client's proprietary hardware. But even then, keep an eye out for the entire range of hardware your client has in his repertoire -- you never know when he might ask for it.
Does using a different compiler on even the same platform have benefits?
Yes, again. Different compilers implement different extensions. See to it that you are not dependent on a particular version of a particular compiler.
Further, there may be a bug or two in the compiler itself. Using multiple compilers helps sort these out.
I have further seen bits of a (cross-platform) product using two different compilers -- one was used in those modules where floating point manipulation required a very high level of accuracy. (It's been a while since I've heard of anyone else doing that, but ...)
I've ported a large C++ program, originally Win32, to Linux. It wasn't very difficult. Mostly dealing with compiler incompatibilities, because the MS C++ compiler at the time was non-compliant in various ways. I expect that problem has mostly gone now (until C++0x features start gradually appearing). Also writing a simple platform abstraction library to centralize the platform-specific code in one place. It depends to what extent you are dependent on services from the OS that would be hard to mimic on a new platform.
You don't have to build portability in from the ground up. That's why "porting" is often described as an activity you can perform in one shot after an initial release on your most important platform. You don't have to do it continuously from the very start. Purely for economic reasons, if you can avoid doing work that may never pay off, obviously you should. The cost of porting later on, when really necessary, turns out to be not that bad.
Mostly, there is an existing platform that the application is written for (individual software). But you address more developers (on both platforms) if you decide to develop in a platform-independent language.
Also products (standard software) for SMEs can be sold better if they run on different platforms! You can gain access to both markets, WIN&LINUX! (and MacOSx and so on...)
Big companies mostly buy hardware which is supported/certified by the product vendor only to deploy the specified product.
If you develop on multiple platforms at the same time you get the advantage of being able to use different tools. For example, I once had a memory overwrite (I still swear I didn't need the +1 for the null byte!) that caused "free" to crash. I brought the code up on Windows and found the overwrite in about 1 minute with Rational Purify... it had taken me a week under Linux of chasing it (valgrind might have found it... but I didn't know about it at the time).
Different compilers on the same or different platforms is, to me, a must as each compiler will report different things, and sometimes the report from one compiler about an error will be gibberish but the other compiler makes it very clear.
Using things like multiple databases while developing means you are much less likely to tie yourself to a particular database, which means you can swap out the database if there is a reason to do so. If you want to integrate something that uses Oracle into an existing infrastructure that uses SQL Server, for example, it can really suck - much better if the Oracle or SQL Server pieces can be moved to the other system (I know of some places that have 3 different databases for their financial systems... ick).
In general, always developing for two or three things means that the odds of you finding mistakes is better, and the odds of the system being more flexible is better.
On the other hand all of that can take time and effort that, at the immediate time, is seen as an unneeded expense.
Some platforms have really dreadful development tools. I once worked in an IB where, rather than use Sun's ghastly toolset, people developed code in VC++ and then ported it to Solaris.

Usability: speech recognition versus keypad

We are seeing more and more speech recognition implemented and requests for libraries that do good speech recognition. What's the rationale (in terms of usability) behind it versus a keyboard or keypad? What reasons would you have to invest in this development?
For example, let's take the call centers. A few years ago, almost every call center used an IVR that prompted for a key for the menus. Now, we're seeing more and more menus with prompt for a spoken keyword and/or a pressed keypad: "please say invoice or press 1 to see your invoice". Or we are seeing the same thing in companies' phone directory: "please say the name of the person you are trying to reach" ... "Franck Loyd" ... "Did you say Jack Freud? Please say yes if you want to reach this person or say no to try again".
I guess it's a plus when you're in your car without holding your phone, but is it worth the additional waiting time? Longer interaction for all the choices, longer prompt time while trying to analyze if something was said, and so on? Also, reliability is better than it was, definitely, but sometimes it feels more like a toy someone decided to plug into the system so it can feel futuristic.
Any experience designing IVR or software that used (or chose not to) speech recognition?
Thanks!
What's the rationale (in terms of usability) behind it versus a keyboard or keypad?
Usability is a very broad term. If I were to attempt to enter my address with a touch pad, it wouldn't be considered very usable. Some argue that using a speech engine with an overall success rate of 70-80% isn't very usable either. As indicated in other posts, hands free input can be much easier for those on a mobile phone. However, using words versus numeric input can actually be less intuitive than a touch tone phone if the topic is somewhat foreign to the caller. A caller hearing terms and phrases that aren't very familiar can't remember them in the 10-30 seconds of the prompt but they can hover over the best sounding choice with their finger or remember the order of choices.
What reasons would you have to invest in this development?
This is an odd question. Usually the decision to use speech or not in an IVR environment is not driven from the development view of the world. Unless you have a specific requirement that really requires speech, you are almost always reducing overall success rates. Speech is usually a factor of corporate image ... or having the latest technological toy.
I guess it's a plus when you're in your car without holding your phone, but is it worth the additional waiting time?
Speech recognition latencies aren't very high these days when using modern ASRs. In most cases, input is handled in parallel with speech, and the time between the end of speech and the recognition result is 0.5 to 1 s. Be aware that many IVRs then need to perform data look-ups after some inputs, and this can appear as a slower system. Normal inputs pushing beyond 1 s is usually the sign of an under-powered deployment.
It may not have been under-powered when originally implemented, but through tuning efforts you make a lot of performance-versus-accuracy decisions. To get that next 0.1%, resources can be pushed beyond what they should be at peak.
Also, reliability is better than it was, definitely, but sometimes it feels more like a toy someone decided to plug into the system so it can feel futuristic.
In general, yes. On the reliability note, you need to really look at the overall numbers to get a sense of the system. It is a battle of statistics where the individual isn't very important (unless they hold the title of VP or above). Through optimization of the input (shifting prompting), resource usage and other speech reco tuning parameters, you attempt to maximize accuracy. For basic natural language responses, you can get into the upper 90s. However, your overall success rate is much lower. Imagine 5 prompts all at 98% (in reality, you tend to have a bunch at 99 and then a few in the mid 90s or slightly below): .98 * .98 * .98 * .98 * .98 = 90%. That means 1 call out of 10 failing. That is before caller confusion and business rules. DTMF input is usually very near 100%, even after several inputs.
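For the arithmetic above, here is a one-liner you can play with (the per-prompt rates are the example's numbers, not measurements):

    from functools import reduce

    per_prompt = [0.98, 0.98, 0.98, 0.98, 0.98]           # individual recognition rates
    end_to_end = reduce(lambda a, b: a * b, per_prompt)   # chained success probability
    print(f"{end_to_end:.2%}")                            # ~90.39%, i.e. roughly 1 call in 10 fails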
Any experience designing IVR or software that used (or chose not to) speech recognition?
Yes. But, I suspect that really isn't the question you want. As someone on the technology side, this is usually not your decision and you have limited influence on it. If you are really looking for the pros/cons of speech:
Pros:
Cool/hip (note, speech alone isn't sufficient. You need a great VUI and voice talent)
Good for a highly mobile crowd that shuns ear pieces. The future is supposed to be blending speech with tactile input. Maybe. It probably won't come from the IVR side of the market.
Good for tasks that can't be done with DTMF. Note, many of these problems tend to have low success rates in speech as well. Cost (versus humans) is usually the driving factor not usability. Dropping a call into a voicemail box for things like address change can be very cost effective.
Cons:
Expensive to develop, deploy and maintain. Adding new choices can have a significant impact on success rates if you aren't careful. Always monitor the impact of change.
Is often deployed inappropriately. For example, "just say your numeric menu choice". This is often a case of wanting speech coolness, but not being able to afford what it really takes to achieve speech coolness.
Success rates will be lower and therefore call center costs will be higher.
Failures tend to focus on specific prompts and individual callers. A caller that regularly experiences problems with your system will be very unhappy with you.
Callers get angry when they aren't understood. Is your goal to identify a subset of your customer base and really get them angry?
I think that speech recognition, like any method of input, has its pros and cons.
Pros
No learning curve, we have been speaking since a very young age.
Very user-intuitive.
On the phone, no need to constantly move the headset from your ear.
Cons
Longer wait time
If bad sound quality, takes multiple attempts to get the selection right.
In some cases a company is required to handle rotary phones. It might be found more cost-effective to just set up the recognition system instead of both.
Voice recognition has a lot more overhead than touch tones. If you want the best results you need to constantly tweak the app and train the system on unrecognized word pronunciations. You also need to be very particular on how you prompt the user with voice recognition or you may get unexpected responses.
Overall touch tone is a lot easier as there are only a limited set of possible options at any given time.
If your app is straightforward enough, voice rec may only complicate it. Press 2 for some other language...
Speech recognition is definitely the wave of the future when combined with touchscreen technology. As an example, I use tazti speech recognition. It's available in XP and Vista versions. Since Microsoft's touchscreen "Surface" platform runs on Vista, I'm sure tazti will work with the touchscreen technology. When I tried tazti speech recognition the built-in commands worked great. Also it lets me create my own speech commands, and those also work great. Voice searching Google and Yahoo, Wikipedia, YouTube and many other search engines works great. It has many other features as well. But it doesn't have dictation. I found that I eliminate 70% or more of my internet-generated clicks... maybe more. NOTE: tazti is a free download from their website.

A brilliant example of effective encapsulation through information hiding?

"Abstraction and encapsulation are complementary concepts: abstraction focuses on the observable behavior of an object... encapsulation focuses upon the implementation that gives rise to this behavior... encapsulation is most often achieved through information hiding, which is the process of hiding all of the secrets of object that do not contribute to its essential characteristics." - Grady Booch in Object Oriented Analysis and Design
Can you show me some powerfully convincing examples of the benefits of encapsulation through information hiding?
The example given in my first OO class:
Imagine a media player. It abstracts the concepts of playing, pausing, fast-forwarding, etc. As a user, you can use this to operate the device.
Your VCR implemented this interface and hid or encapsulated the details of the mechanical drives and tapes.
When a new implementation of a media player arrives (say a DVD player, which uses discs rather than tapes) it can replace the implementation encapsulated in the media player and users can continue to use it just as they did with their VCR (same operations such as play, pause, etc...).
This is the concept of information hiding through abstraction. It allows for implementation details to change without the users having to know and promotes low coupling of code.
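A minimal sketch of that example (class names are mine, not from the course): callers program against the abstract player, so the tape or disc mechanics stay hidden and can be swapped.

    from abc import ABC, abstractmethod

    class MediaPlayer(ABC):
        @abstractmethod
        def play(self): ...
        @abstractmethod
        def pause(self): ...

    class Vcr(MediaPlayer):
        def play(self):  print("spinning tape reels...")
        def pause(self): print("holding the tape in place...")

    class DvdPlayer(MediaPlayer):
        def play(self):  print("spinning the disc, reading with the laser...")
        def pause(self): print("freezing the frame buffer...")

    def watch_movie(player: MediaPlayer):
        player.play()       # the caller never changes when the device does
        player.pause()

    watch_movie(Vcr())
    watch_movie(DvdPlayer())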
The *nix abstraction of character streams (disk files, pipes, sockets, ttys, etc.) into a single entity (the "everything is a file") model allows a wide range of tools to be applied to a wide range of data sources / sinks in a way that simply would not be possible without the encapsulation.
Likewise, the concept of streams in various languages, abstracting over lists, arrays, files, etc.
Also, concepts like numbers (abstracting over integers, half a dozen kinds of floats, rationals, etc.) imagine what a nightmare this would be if higher level code was given the mantissa format and so forth and left to fend for itself.
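To make the stream point concrete in Python (standard library only, on a Unix-like system; the example inputs such as /etc/hosts are arbitrary): the same function consumes a disk file, an in-memory buffer, or a pipe, because each one just looks like "something with lines".

    import io
    import subprocess

    def count_lines(stream) -> int:
        return sum(1 for _ in stream)

    with open("/etc/hosts") as disk_file:                  # a real file on disk
        print(count_lines(disk_file))

    print(count_lines(io.StringIO("one\ntwo\nthree\n")))   # an in-memory buffer

    proc = subprocess.Popen(["ls", "/"], stdout=subprocess.PIPE, text=True)
    print(count_lines(proc.stdout))                        # a pipe from another process
    proc.wait()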
I know there's already an accepted answer, but I wanted to throw one more out there: OpenGL/DirectX
Neither of these APIs is a full implementation (although DirectX is certainly a bit more top-heavy in that regard); instead they are generic methods of communicating render commands to a graphics card.
The card vendors are the ones that provide the implementation (driver) for a specific card, which in many cases is very hardware specific, but you as the user need never care that one user is running a GeForce ABC and the other a Radeon XYZ because the exact implementation is hidden away behind the high-level API. Were it not, you would need to have a code path in your games for every card on the market that you wanted to support, which would be completely unmanageable from day 1. Another big plus to this approach is that Nvidia/ATI can release a newer, more efficient version of their drivers and you automatically benefit with no effort on your part.
The same principle is in effect for sound, network, mouse, keyboard... basically any component of your computer. Whether the encapsulation happens at the hardware level or in a software driver, at some point all of the device specifics are hidden away to allow you to treat any keyboard, for instance, as just a keyboard and not a Microsoft Ergonomic Media Explorer Deluxe Revision 2.
When you look at it that way, it quickly becomes apparent that without some form of encapsulation/abstraction computers as we know them today simply wouldn't work at all. Is that brilliant enough for you?
Nearly every Java, C#, and C++ code base in the world has information hiding: It's as simple as the private: sections of the classes.
The outside world can't see the private members, so a developer can change them without needing to worry about the rest of the code not compiling.
What? You're not convinced yet?
It's easier to show the opposite. We used to write code that had no control over who could access details of its implementation. That made it almost impossible at times to determine what code modified a variable.
Also, you can't really abstract something if every piece of code in the world might possibly have depended on the implementation of specific concrete classes.
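The same idea sketched in Python (the answers above lean on C++/Java private: sections; Python relies on convention and properties, but the payoff is identical): callers use the public interface, so the internal representation can change without breaking them.

    class Temperature:
        def __init__(self, celsius: float):
            self._kelvin = celsius + 273.15     # hidden detail; an earlier version stored Celsius

        @property
        def celsius(self) -> float:             # the public interface never changed
            return self._kelvin - 273.15

    t = Temperature(20.0)
    print(t.celsius)    # ~20.0 -- calling code never notices the storage unit changed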

How would you go about reverse engineering a set of binary data pulled from a device?

A friend of mine brought up this question the other day. He recently bought a Garmin heart rate monitor device which keeps track of his heart rate and allows him to upload his heart rate stats for a day to his computer.
The only problem is there are no Linux drivers for the Garmin USB device. He's managed to interpret some of the data, such as the model number and his user details, and has identified that there are, essentially, some binary data tables which we assume represent a series of recordings of his heart rate and the time each recording was taken.
Where does one start when reverse engineering data when you know nothing about the structure?
I had the same problem and initially found this project at Google Code that aims to complete a cross-platform version of tools for the Garmin devices ... see: http://code.google.com/p/garmintools/. There's a link on the front page of that project to the protocols you need, which Garmin was thoughtful enough to release publicly.
And here's a direct link to the Garmin I/O specification: http://www.garmin.com/support/pdf/IOSDK.zip
I'd start looking at the data in a hexadecimal editor, hopefully a good one which knows the most common encodings (ASCII, Unicode, etc.) and then try to make sense of it out of the data you know it has stored.
As another poster mentioned, reverse engineering can be hairy, not in practice but in legality.
That being said, you may be able to find everything related to your root question at hand by checking out this project and its code... and they do handle the runner's heart rate/GPS combo data as well
http://www.gpsbabel.org/
I'd suggest you start with checking the legality of reverse engineering in your country of origin. Most countries have very strict laws about what is allowed and what isn't regarding reverse engineering devices and code.
I would start by seeing what data is being sent by the device, then consider how such data could be represented and packed.
I would first capture many samples and see if any pattern presents itself, since heart beat is something regular, which would suggest a measurement related to the heart itself. I would also look for bit fields which are monotonically increasing, as that would suggest some sort of time stamp.
Having formed a hypothesis for what is where, I would write a program to test it and graph the results and see if it makes sense. If it does but not quite, then closer inspection would probably reveal you need some scaling factors here or there. It is also entirely possible I need to process the data first before it looks anything like what their program is showing, i.e. might need to integrate the data points. If I get garbage, then it is back to the drawing board :-)
I would also check the manufacturer's website, or maybe run strings on their binaries. Finding someone who works in the field of biomedical engineering would also be on my list, as they would probably know what protocols are typically used, if any. I would also look for these protocols and see if any could be applied to the data I am seeing.
I'd start by creating a hex dump of the data. Figure it's probably blocked in some power-of-two-sized chunks. Start looking for repeating patterns. Think about what kind of data they're probably sending. Either they're recording each heart beat individually, or they're recording whatever the sensor is sending at fixed intervals. If it's individual beats, then there's going to be a time delta (since the last beat), a duration, and a max or avg strength of some sort. If it's fixed intervals, then it'll probably be a simple vector of readings. There'll probably be a preamble of some sort, with a start timestamp and the sampling rate. You can try decoding the timestamp yourself, or you might try simply feeding it to ctime() and see if they're using standard absolute time format.
Keep in mind that lots of cheap A/D converters only produce 12-bit outputs, so your readings are unlikely to be larger than 16 bits (and the high-order 4 bits may be used for flags). I'd recommend resetting the device so that it's "blank", dumping and storing the contents, then take a set of readings, record the results (whatever the device normally reports), then dump the contents again and try to correlate the recorded results with whatever data appeared after the "blank" dump.
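A starting-point sketch along those lines (the file name and field layout are guesses, not Garmin's actual format): dump the bytes to eyeball repeating patterns, then scan for 32-bit little-endian values that decode to plausible timestamps via ctime().

    import struct
    import time

    data = open("dump.bin", "rb").read()        # whatever was pulled off the device

    # 1. Simple hex dump, 16 bytes per row, to spot repeating block structure.
    for offset in range(0, min(len(data), 256), 16):
        row = data[offset:offset + 16]
        hexpart = " ".join(f"{b:02x}" for b in row)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        print(f"{offset:08x}  {hexpart:<47}  {text}")

    # 2. Look for candidate timestamps: values that decode to dates near "now".
    lo, hi = time.time() - 10 * 365 * 86400, time.time()
    for offset in range(len(data) - 4):
        value = struct.unpack_from("<I", data, offset)[0]
        if lo < value < hi:
            print(f"possible timestamp at 0x{offset:x}: {time.ctime(value)}")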
Unsure if this is what you're looking for but Garmin has created an API that runs with your browser. It seems OSX is supported, as well as Windows browsers... I would try it from Google Chromium to see if it can be used instead of this reverse engineering...
http://developer.garmin.com/web-device/garmin-communicator-plugin/
API Features:
Auto-detection of devices connected to a computer
Access to device product information like product name and software version
Read tracks, routes and waypoints from supported recreational, fitness and navigation devices
Write tracks, routes and waypoints to supported recreational, fitness and navigation devices
Read fitness data from supported fitness devices
Geo-code address and save to a device as a waypoint or favorite
Read and write Garmin XML files (GPX and TCX) as well as binary files
Support for most Garmin devices (USB, USB mass-storage, most serial devices)
Support for Internet Explorer, Firefox and Chrome on Microsoft Windows
Support for Safari, Firefox and Chrome on Mac OS X
Can you synthesize a heart beat using something like a computer speaker? (I have no idea how such devices actually work). Watch how the binary results change based on different inputs.
Ripping apart the device and checking out what's inside would probably help too.