What's a good explanation of statistical machine translation? - language-agnostic

I'm trying to find a good high-level explanation of how statistical machine translation works. That is, supposing I have a corpus of non-aligned English, French and German texts, how could I use that to translate any sentence from one language to another? It's not that I'm looking to build a Google Translate myself, but I'd like to understand how it works in more detail.
I've searched Google but come across nothing good: everything either quickly needs advanced mathematics knowledge to understand or is way too generalized. Wikipedia's article on SMT seems to be both, so it doesn't really help much. I'm skeptical that this is such a complex area that it's simply not possible to understand without all the mathematics.
Can anyone give, or does anyone know of, a general step-by-step explanation of how such a system works, targeted towards programmers (so code examples are fine) but without needing a mathematics degree to understand? A book like this would be great too.
Edit: A perfect example of what I'm looking for would be an SMT equivalent to Peter Norvig's great article on spelling correction. That gives a good idea of what's involved in writing a spell checker, without going into detailed maths on Levenshtein/soundex/smoothing algorithms etc.

Here is a nice video lecture (in 2 parts):
http://videolectures.net/aerfaiss08_koehn_pbfs/
For in-depth details, I highly advise this book:
http://www.amazon.com/Statistical-Machine-Translation-Philipp-Koehn/dp/0521874157
Both are from the guy who created the most widely used MT system in research. The book covers all the fundamentals and is accurate and very well explained. It is probably one of the de facto standard books that any researcher beginning in this field should read.

The Atlantic Online had a very straightforward nontechnical description of statistical machine translation back in December 1998:
Lost in Translation by Stephen Budiansky
I've read nontechnical stuff on statistical MT before but always wondered "yeah, but how does the statistical stuff know which words map to which when word orders vary and supposedly no dictionary and no grammar are used?" Well, this article actually does answer that, and it's simple and straightforward, which quite surprised me.
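For programmers who want to see the "which words map to which" idea in code: one classic approach is IBM Model 1, which estimates word translation probabilities from sentence pairs using expectation-maximization. The following is only a minimal Python sketch of that idea, not something taken from the article. It assumes a sentence-aligned toy corpus (the three German-English pairs are made up for illustration, and the names `CORPUS`, `train_model1` and `best_translation` are hypothetical), it ignores word order and the NULL word, and unlike the question's setup it does need sentence pairs.

```python
from collections import defaultdict

# Made-up, sentence-aligned toy corpus: (German words, English words).
CORPUS = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]

def train_model1(pairs, iterations=10):
    """Estimate t[(en, de)] = P(en | de) with EM, IBM Model 1 style (no NULL word)."""
    en_vocab = {en for _, en_sent in pairs for en in en_sent}
    # Start with no knowledge at all: every pairing is equally likely.
    t = defaultdict(lambda: 1.0 / len(en_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected number of times de translates to en
        total = defaultdict(float)  # expected number of times de translates to anything
        for de_sent, en_sent in pairs:
            for en in en_sent:
                # E-step: share one unit of "credit" for en among the German
                # words in the same sentence, in proportion to current beliefs.
                norm = sum(t[(en, de)] for de in de_sent)
                for de in de_sent:
                    frac = t[(en, de)] / norm
                    count[(en, de)] += frac
                    total[de] += frac
        # M-step: re-estimate the probabilities from the expected counts.
        t = defaultdict(float, {(en, de): c / total[de] for (en, de), c in count.items()})
    return t

def best_translation(t, de_word):
    """Return the English word with the highest P(en | de_word)."""
    candidates = {en: p for (en, de), p in t.items() if de == de_word}
    return max(candidates, key=candidates.get)

if __name__ == "__main__":
    table = train_model1(CORPUS)
    for word in ["das", "haus", "buch", "ein"]:
        print(word, "->", best_translation(table, word))
```

No dictionary goes in: "das" appears with "the" in two sentences, which pushes "haus" towards "house" and "buch" towards "book", which in turn leaves "ein" with "a". A real system adds a lot on top (phrases, reordering, a language model for the output side).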

A Peter Norvig talk from Google Developer Day 2007, Theorizing from Data: Avoiding the Capital Mistake, contains an accessible high-level explanation of the principles of statistical machine translation (starting from about 21:20).

Related

Toy projects for new programmers

When I first started teaching myself programming, after finishing a tutorial I would feel like I still couldn't do anything in the language. So I looked around to find something to work on. Since I had just learned a few of the basics, the amount of work involved in finding, reading and adding to an open source project seemed insurmountable. Instead I started on a couple of toy projects, which ended up being incredibly beneficial.
Having seen a lot of questions from beginners similar to "what should I do now?" and a lot of answers similar to "start working for an open source project" has made me think there has to be better advice for a new programmer. While working on an open source project surely gives great experience, there is a perceptible barrier to entry.
Instead, I think it would be great if new programmers were prodded towards working on a toy program related to some interest they have. Since there are so many directions that programming can take you, I think it would be interesting to list some simple (but fun/rewarding) projects grouped by the direction the new programmer is looking to pursue. Such as:
Game Design:
Write a text adventure (like Zork)
Natural Language Processing:
Create a program that writes meaningless, but grammatically valid essays.
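To make the essay-generator idea concrete, here is a minimal Python sketch that expands a tiny context-free grammar at random. The grammar, its vocabulary and the function names (`GRAMMAR`, `expand`, `sentence`) are all made up for illustration; a more serious version would need agreement rules, a much larger vocabulary and some paragraph structure.

```python
import random

# Hypothetical toy grammar: each symbol maps to a list of possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"]],
    "VP": [["Vt", "NP"], ["Vi"]],
    "Det": [["the"], ["every"], ["this"]],
    "Adj": [["green"], ["silent"], ["recursive"]],
    "N": [["compiler"], ["turtle"], ["theory"]],
    "Vt": [["devours"], ["compiles"], ["questions"]],
    "Vi": [["sleeps"], ["halts"]],
}

def expand(symbol, rng):
    """Recursively expand a grammar symbol into a list of words."""
    if symbol not in GRAMMAR:
        return [symbol]  # a terminal: an actual word
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(expand(part, rng))
    return words

def sentence(rng):
    """Build one capitalized sentence ending in a full stop."""
    return " ".join(expand("S", rng)).capitalize() + "."

if __name__ == "__main__":
    rng = random.Random()
    print(" ".join(sentence(rng) for _ in range(5)))
```

Every sentence follows the grammar, so it parses, even though the result means nothing.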
I recently asked a similar question (Diverse resource of problems to show merits of different languages) and got links to sites that provide problem sets, as well as validation. Check out:
http://www.codechef.com/
https://www.spoj.pl/problems/classical/
http://wiki.python.org/moin/ProblemSets
http://projecteuler.net/
Although these problems don't often amount to projects, they are still interesting. I'm interested in seeing what people come up with here.
I actually think that a TopCoder approach might be better: programmers can still pick topics of interest, but they're actually working for a prize on a REAL project and they get feedback. Frankly speaking, TopCoder is a bit bloated and, as far as I can tell, they don't allow people to run free competitions. It would be great if there were a TopCoder/StackOverflow type of site where people could submit code, get voted on their implementation and just have a good time!
I'll even pitch my idea: I'm starting to work on my own version of a TopCoder/StackOverflow hybrid monstrosity called MyDevArmy (although I have not done anything so far except buy the domain).
Write a program which renders Wolfram automata (esp. Rule 110).
See YelloSoft for example code.
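For a rough idea of how little code this takes, here is a minimal Python sketch of an elementary cellular automaton rendered as text. It is not the YelloSoft code; the function names (`step`, `render`) are made up, and it assumes cells beyond the edges of the row are always dead.

```python
def step(cells, rule=110):
    """Apply one generation of an elementary cellular automaton.

    Each new cell depends on its (left, centre, right) neighbourhood, read as
    a 3-bit number that selects one bit of the rule number.
    Cells outside the row are treated as 0 (an assumption of this sketch).
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0
        centre = cells[i]
        right = cells[i + 1] if i < n - 1 else 0
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells

def render(width=64, steps=32, rule=110):
    """Return the generations as text lines, starting from one live cell on the right."""
    cells = [0] * width
    cells[-1] = 1
    lines = []
    for _ in range(steps):
        lines.append("".join("#" if c else "." for c in cells))
        cells = step(cells, rule)
    return lines

if __name__ == "__main__":
    print("\n".join(render()))
```

Passing a different `rule` (for example 30 or 90) renders the other Wolfram automata with the same code.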
Start by writing a Blackjack simulation. Choose whichever strategy you want for the first run.
Next, start adding runs for different strategies, like hitting/standing when your hand's value is 15 vs. 16 vs. 17 vs. 18, and whether the hand is soft or hard (an ace's value being counted as 11 or 1). The dealer's strategy will be constant, as it really is in casinos.
By the end, your program will run, say, 1000 instances of each strategy combination. It will print out a summary of the rate of hand wins (percentage of times you beat the dealer) for each stand value and hard/soft combination.
This is easily one of my favorite projects I've done and it can really cement some techniques in the language of your choosing. Plus, if you have the initiative to start learning some of the (fairly simple) discrete math that's involved in coming up with the odds of these situations as a side project, you can come away with an even better experience. Who knows, maybe you could ditch this computer stuff and take up card counting?
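A minimal Python sketch of such a simulation might look like the following. It makes several simplifying assumptions that are mine, not part of the description above: cards are drawn from an infinite deck, the dealer stands on any 17, there is no splitting or doubling, blackjacks get no special treatment, and a tie counts as "not a win". The names (`RANKS`, `hand_value`, `play_hand`, `win_rate`) are made up.

```python
import random

# Card values for one suit; 11 is an ace. Infinite-deck assumption: every draw
# is independent, which is simpler than tracking a real shoe.
RANKS = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]

def hand_value(cards):
    """Return (total, is_soft). Aces drop from 11 to 1 as needed to avoid busting."""
    total = sum(cards)
    aces = cards.count(11)
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total, aces > 0

def play_hand(rng, stand_hard, stand_soft):
    """Play one hand; return True if the player beats the dealer."""
    player = [rng.choice(RANKS), rng.choice(RANKS)]
    dealer = [rng.choice(RANKS), rng.choice(RANKS)]
    while True:
        total, soft = hand_value(player)
        if total >= (stand_soft if soft else stand_hard):
            break
        player.append(rng.choice(RANKS))
    if total > 21:
        return False  # player busts
    while hand_value(dealer)[0] < 17:  # constant dealer strategy (assumed)
        dealer.append(rng.choice(RANKS))
    dealer_total = hand_value(dealer)[0]
    return dealer_total > 21 or total > dealer_total

def win_rate(stand_hard, stand_soft, runs=1000, seed=0):
    """Fraction of hands won over `runs` simulated hands."""
    rng = random.Random(seed)
    wins = sum(play_hand(rng, stand_hard, stand_soft) for _ in range(runs))
    return wins / runs

if __name__ == "__main__":
    for stand_hard in (15, 16, 17, 18):
        for stand_soft in (17, 18):
            rate = win_rate(stand_hard, stand_soft)
            print(f"stand on hard {stand_hard}, soft {stand_soft}: {rate:.1%} wins")
```

With only 1000 runs per combination the differences between strategies are mostly noise, so increasing `runs` is a natural next step.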

Programmers dictionary/lexicon for non native speakers

I'm not a native English speaker, and I'm not very good at English. I'm self-taught. I have not worked together with others on a common codebase. I don't have any friends who program. I don't work with other programmers (at least nobody who cares about these things).
I guess this might explain some of my problems in finding good, unambiguous class names. I have tried to find some sort of "Programmers dictionary" containing commonly used words and their meanings. When reading others' code I have to look up words quite often, and as many people use abbreviations this poses an additional challenge.
My very limited vocabulary "forces" me to use bad class names like xxManager, xxProvider, xxWhatever. It's usually less problematic choosing variable and method names.
Other non English people out here: How have you managed to cope with this? Have you studied English so well it's not a problem? Or have you read so much code naming comes natural? Or discussed a lot with English speakers? Found any good websites, articles or other publications? As I've never read anything regarding programming in my own language, I often have more problems trying to find the words in my language...
PS: All the other posts I've found were about mixing one's native tongue and English... And I understand this might be a bit off topic and might be closed.
Edit: Some resources from the answers and other stuff I use:
Jargon / The New Hacker's Dictionary
Common design patterns
Google translate
Dictionary
The Jargon file will help with the more obscure references people will give in the industry.
http://catb.org/jargon/html/go01.html
Other than that, finding good names for your variables/classes/etc. is hard. Oftentimes, it's harder than actually solving the problem. Here's a good resource for some common design pattern names people like to use: http://en.wikipedia.org/wiki/Design_pattern_%28computer_science%29
Examples:
AbcFactory
XyzBridge
Could be an unorthodox suggestion, but I would recommend studying English more deeply (I am also a non-native speaker).
Expose yourself to as much English as possible! Watch movies, read English fiction, listen to technical podcasts.
Mind you, if you really want to deepen your knowledge of English, you're probably not going to learn a lot watching "Transformers". On the other hand, diving into Ulysses probably is not a good strategy either.
If you're feeling adventurous, you could always get a subscription to the New Yorker magazine. It'll do things to you - yes this is flamebaiting. :P
Other non English people out here:
How have you managed to cope with this?
Good naming in code matters. Using English is preferred, but if you don't know English very well the result could be counterproductive.
I had a friend who just guessed what the correct name would be, and the result was horrible, e.g.:
String employiiNeim; // employeeName
int eich; // age
The problem with English is that it is not pronounced as written (French has this minor... ehrm, characteristic too). Other languages, like Spanish, German and Dutch, spell words much more closely to how they are pronounced.
This becomes particularly relevant when what you are coding are business rules or business models. In this case it is much better to use your native language.
String nombreEmpleado;
int edad;
Much better, especially when you work with others.
Have you studied English so well it's not a problem?
Yep, there is no other way, and it takes a lot of practice.
You can study English the same way you study programming languages, though. You can have a teacher, attend a class and study an hour a day. Or (what I did) you can just grab something that is interesting to you and try to understand it. For instance, you find a small document describing something you care about, you read blogs or content here at StackOverflow, you translate a song you like, etc.
All of these are forms of study. There is no other way; you won't wake up one day and say: "...I know kung fu", I mean, and say: "...I know English".
Or have you read so much code naming comes natural?
It also helps, but if you don't understand what the code means, you... well, won't make any progress.
You'll learn the programming language, and that will help you to understand English a bit better, but it won't help you to learn it. That's because when we program we learn the programming language, not the natural language.
Or discussed a lot with English speakers?
Eerhh... nope. If you have that chance, go ahead; it will improve your listening and speaking, but not necessarily your writing.
The most effective way to improve your English vocabulary and grammar is by READING (reading in your native language also improves your own language, btw).
So, I would say, read as much as you can. Use your native language while you gain more confidence, and keep studying.
The English will come with time.
If you can't find the "Programmer's Dictionary" you're looking for, start one. Post a new question: "What entries are missing from this Dictionary for English-as-a-Second-Language-Programmers?" and seed it with 10 or 20 words/definitions you've already discovered. Once posters have suggested enough additions, move it to a wiki somewhere and keep accepting contributions. You might end up creating a valuable resource.
Documenting your code with excellent prose like your question above will go a long way!
If you stick to the common design patterns endemic to the language, platform, and architecture you're working with, other engineers should understand your nomenclature fairly easily.
If you are worried about it in terms of naming your own objects, just think of what your native word is for what you want to do, then go get an English translation dictionary and use the English version.
How about using your native language?
Of course (like for me as an Austrian) some letters may not be allowed - but who cares if there is Mörder or Moerder (murderer) in the class name :)
Or (as I do) use a dictionary like dict.cc or something else.
What I do is think about what the class does: it manages a game session (for example), so it becomes GameSessionManager.
Abbreviations are (at least for me) a problem - but what I've learned from other code is that even native speakers use different abbreviations.
And whether the class is called GameSessionMgr or GameSessionMngr doesn't make a difference.
You are not writing books or some kind of "English poem" where spelling, grammar and so on count.
You write code - and if you follow "your special rules", you and others will (after some time) be able to understand your code and class names.
It will come with time and experience. Above all attempt to (like #Mike A says) document things until the code becomes clearer and try to be consistent.
This is an issue that I run into as well, even as a native English speaker. As a programmer, I often need to find a descriptive word for a class, variable, function, etc. I often find myself asking a friend or coworker what verbiage they would use by explaining my idea, carefully excluding any words I myself have considered as a possible choice for the class/function/variable name so as not to inhibit their creativity.
It seems to me that the English Language & Usage site proposal over at Area51 is a good place to ask such questions as "What would you call a class (or thing) that does this, this and that, and has properties x, y, and z?"

Does knowing a Natural Language well help with Programming?

We all hear that math at least helps a little bit with programming. My question though, does English or other natural language skills help with programming? I know it has to help with technical documentation, but what about actual programming? Are certain constructs in a programming language also there in natural languages? Does knowing how to write a 20 page research paper help with writing a 20k loc programming project?
Dijkstra went so far as to say: "Besides a mathematical inclination, an exceptionally good mastery of one's native tongue is the most vital asset of a competent programmer."
Edit: yes, I'm reasonably certain he was talking about the programming part of the job. Here's a bit more complete quote:
The problems of business administration in general and database management in particular are much too difficult for people who think in IBMerese, compounded by sloppy English.
About the use of language: it is impossible to sharpen a pencil with a blunt axe. It is equally vain to try to do it with ten blunt axes instead.
Besides a mathematical inclination, an exceptionally good mastery of one's native tongue is the most vital asset of a competent programmer.
From EWD498.
I certainly can't speak for Dijkstra, but I think it's impossible to cleanly separate the part where you're doing actual programming from the part where you're interacting with people. Just for example, even when you're working alone, it's crucial that you're able to understand (clearly and unambiguously) notes you wrote down about what to do, the nature of a bug, etc. A good command of English is necessary even when nobody else is involved at all (and, of course, that's unusual except on trivial tasks).
I don't know about causality, but the skill set required to write well overlaps quite a bit with those required for programming: knowing how to plan, being able to keep a myriad of details consistent, being able to make things clear for a future reader, knowing how to organize your thoughts and the resultant product. That isn't to say that a successful author would make a good programmer, but a programmer with good language skills and the same logic/math/deductive skills is probably a better programmer than one with poor language skills -- at least the code has a greater chance of being understandable.
Yes. Strong natural language skills help you to organize your thoughts in a coherent way that can easily be understood by others. That can help improve your code in everything from naming variables, methods, classes, etc., to expressing the contexts of objects in your model. Practices such as pair programming require you to be able to communicate well with your partner in order to write good code. Techniques such as Domain-Driven Design emphasize using the domain language of the business in your code. Natural language skills facilitate that. And there is a strong drive in the development industry toward more natural language-like tools, e.g. many of the newer testing tools like rspec, gherkin, etc., are moving toward more natural language-like syntax. One of the things many people like about dynamic languages like Ruby and Python is that the code tends to read more like a natural language.
Let me state what should be obvious: every healthy person above 12 knows at least one natural language. Moreover, every healthy person above 12 is able to generate and parse a complex and rich language, and to express and understand an extremely large set of ideas. In general, people are not likely to be limited in their ability to discuss issues by their language, but by the type of things they have experienced and learned.
Having said that, there are several language-related skills that you might have thought about.
Writing style. You mentioned this specifically. Written language is different from spoken language, and way less intuitive. This is one reason people have to get coached in writing through their years in the education system.
Coding doesn't really involve writing. I mean, there are comments, but they can be rather laconic. Of course the work of a programmer usually involves at least some writing of documents, and writing abilities do make a difference there.
Analytical skills. Analytical skills are a complicated (not to say fuzzy) concept. Analytical skills aren't really about language, but insofar as they are taught and tested at all, it's in the context of writing essays.
Analytical skills are obviously very important in programming. I am not sure that these are exactly the same skills required to write a good essay about Euthanasia or whatever, but as was previously suggested, they may be related.
Foreign language. For people whose native language isn't English, a certain command of English may be needed. Not in the coding itself (knowing what "while" means in English isn't really critical to understanding what it does in Java), but because much training and support material is available mainly in English (did anyone mention Stack Overflow?). The English requirement may differ depending on the country you are in and the company you work for, though.
Communication Skills. Ahhm. I was never sure what this means exactly. Maybe it's a cultural thing. I do suspect it's less about knowing a language and more about knowing people.
So to sum up, Dijkstra is a venerable computer scientist, but I am not sure he knew that much about language.
Programming isn't just about writing code. On any programming project of any size there will be the need for:
initial project proposal documents
design and architectural documents
programmers manual
users manual
training materials
communication with third party suppliers
etc.
On every big project I've worked on, I'd guess I spent at least 50% of my time on the English language documents. So yes, an ability to explain and express yourself well is extremely important. Does it lead to writing better code? Once again, I would say yes - the need to provide clear documentation spills over into the need to write better code, interfaces et al.

What is the best dictionary for software development terminology?

On Stack Overflow, I see Wikipedia referred to a lot. However, I'm often not sure whether it is the definitive authority for very specific software development concepts. For example, I have recently looked for definitions of the terms web server/service and RPC/IPC, and the responses I get very often refer to Wikipedia (directly and indirectly).
Hence my question: which sources do you trust the most for definitions of software development jargon?
http://www.google.com
And no, this isn't being tongue-in-cheek.
Personally, I used to trust Wikipedia, and I still read it to get an idea about a subject. But books are definitely a better choice, because they not only have a "compressed" explanation but also provide examples and give a broader description. As the professors at my university say: don't trust Wikipedia, search for an authoritative source. For example, you can find a huge amount of information about web service technology in the book Building Web Services with Java - Making Sense of XML, SOAP, WSDL, UDDI - 2nd Edition 2005. It contains information you'll never find in Wikipedia or even in Google (unless you find this book using it ;) ).
Hope this helps.
Google and technical & non-technical software development books.
"A Story Culture" may be a useful read for you as you want something other than a dictionary, IMO. You want something with the knowledge and wisdom of the topic rather than simply what does this mean. For example, there are a couple of blog posts about Technical Debt that I really like to use for reference about the subject, one from Steve McConnell and one from Martin Fowler.
While I can generally suggest going to the source for a term, there is something to be said for a term getting so overloaded or overused that it has little meaning left. There are a few folks' blogs that I trust to give me some understanding of a subject, including Joel's and Jeff's, but don't forget that each of us has a brain and we shouldn't be afraid to use it.

Suggestions on starting a child programming [closed]

What languages and tools do you consider a youngster starting out in programming should use in the modern era?
Lots of us started with proprietary Basics and they didn't do all of us long term harm :) but given the experiences you have had since then and your knowledge of the domain now are there better options?
There are related queries to this one such as "Best ways to teach a beginner to program?" and "One piece of advice" about starting adults programming both of which I submitted answers to but children might require a different tool.
Disclosure: it's bloody hard choosing a 'correct' answer to a question like this, so whoever has the best score in a few days will get the 'best answer' mark from me, based on the community's choice.
I would suggest LEGO Mindstorms. It provides an intuitive drag-and-drop interface for programming, and because it comes with hardware it provides something tangible for a child to grasp. Also, because it is "LEGO" they might think of it as more of a game than a programming exercise.
My day job is in a school, and over the past few years I've seen or taught (or attempted to teach) various children, in various numbers, programming lessons.
Children are all different - some are quick learners, some aren't. In particular, some have better literacy skills than others, and that definitely makes a difference to the speed at which they'll pick up programming. I bet that most of us here, as professional computer programmers and the kind of people who read and post to forums for fun, learnt to read at a pretty young age. For those kinds of children, and if it's your own child whom you can teach one-on-one, you could do worse than JavaScript - it has the advantage that you can do real stuff with it right away, and the edit-test cycle is simply hitting "refresh" in the browser. It gets confusing when you start to run into how JavaScript does everything asynchronously, and it is tricky to debug, but for a bright child under close tuition these problems can be overcome.
LEGO Mindstorms is definitely up there at the top of the list. Most schools now super-glue the bricks together to create pre-made models that can't have bits nicked off of them, but this shouldn't be a problem at home. Over on the Times Educational Supplement site (website forum for the UK's weekly teaching newspaper), the "what programming language is best for children?" topic comes up pretty regularly. Lots of recommendations over there for Scratch as an alternative to Mindstorms - bit more freedom than Mindstorms, again probably better for the brighter student who could also be given a soldering iron.
I've found that slower pupils can still have problems with Mindstorms, even though the programming environment is "graphical" - there's still a lot going on on screen, and there's a fair bit to remember (this was an older version, mind - haven't tried the snazzy new one yet). In my experience, the best all-round introduction to programming is probably still LOGO - actually a considerably more powerful language than most people give it credit for. The original Mindstorms book by Seymour Papert (nothing to do with LEGO - they nicked the title of the book for their product), one of the originators of LOGO, is the canonical reference for teaching programming to children as a "thinking skill" and for the concept of Constructionism in learning.
We've had classes of 7 or 8 year-olds programming LOGO. Note that we aren't aiming to make them "software developers"; that's a career path they can decide on at some point post-16. At a young age we're trying to get them to think of "computer programming" as just another tool - how to set out a problem to be solved by a computer, in the same way they might use a mind map to help them organise and remember stuff for an exam. No poor child should be sat down and drilled in the minutiae and use of a particular language; they should be left to explore and figure stuff out as they like.
I'll second Geoff's suggestions of Phrogram (used to be KPL), and Alice.
My only other suggestion is Lego Mindstorms NXT. The NXT's programming language is drag-and-drop, is very easy to use, and can do some very complicated tasks once you learn it. Also young boys usually like seeing things move. :)
I've used Alice and NXTs with some young kids, and they've taken to it very well.
Two possibilities are:
Scratch - developed at MIT - http://scratch.mit.edu/
and
EToys, of One Laptop per Child fame - http://wiki.laptop.org/go/Squeak
Full disclosure: I'm one of the guys who invented Kid's Programming Language, which is now http://www.Phrogram.com, which others have recommended here. Let me add some programmer-oriented info about it.
It's a code IDE, rather than drag-and-drop, or designer-based. This was intentional on our part - we wanted to make it easy and fun to do real text-based programming, particularly programming games and graphics. This is a fundamental difference between us and Alice and Scratch. Which you pick is a matter of the kid, their age and aptitudes, your goals. Using them serially with the same beginner might be a great way to go - if you do that, I would recommend Scratch, Alice, Phrogram as the order. Phrogram has worked best for 12 years and up, but I know dads with 6 year olds who have taught their kids with it, and I know 10 year olds who have taught themselves with it.
The language is as much like English as we could make it, and is as minimal as we could make it. The secret sauce is in the class-based object hierarchy, which is again as simple, intuitive and English-like as we could make it. The object hierarchy is optimized for games and graphics. 3D models are available, and 2D sprites. Absolute movement using screen coordinates is supported, or relative movement à la LOGO turtles - Forward(x), TurnLeft(y).
The IDE comes with over 100 examples, some language examples (loops), some learning examples (arrays), some fully-functional games and sims (Pong, Missile Command, Game of Life).
To give you a sense of how highly leveraged we made the language and the IDE: with 27 instructions you can fly a 3D spaceship model around a 3D skybox, using your keyboard. The same with a 2D sprite is 12 to 15 instructions.
We are working on a Blade-compatible release of Phrogram that will allow programs to run on the XBox 360. Yeah, the XBox, on your big TV. Nice motivator for getting a kid started? :)
Phrogram includes support for class-based programming, with methods and properties - but that's only encapsulation, not inheritance or polymorphism.
A tutorial and user guide are available.
My own ebook is available at Amazon and other places online, "Learn to Program with Phrogram!," and gets a beginner started by programming the classic Pong.
Phrogram Programming for the Absolute Beginner, by Jerry Lee Ford, Jr., is also available, as a paperback, at Amazon and elsewhere.
For a child, I would go with Alice. Any kid is going to like the drag-and-drop interaction that Alice uses better than trying to remember how to spell and punctuate any programming language. He/She will learn the basic programming structures (conditionals, loops, etc.) and will experience the fun of building an animated program they can show off to other family or friends.
A beginner CS class at the local community college actually uses Alice to teach programming in a language-independent way. It provides a good foundation for moving into programming in a particular language (or a few languages) down the road.
I recently saw a presentation about GreenFoot (a Java-based learning environment for children). It looked awesome. If I had kids, I would give it a try.
Link to the presentation
It is a very playful environment, where you could start with very basic methods. The kids learn thinking in an object oriented way (you cannot instantiate an animal, but you can instantiate a cat). And the better they get, the more of Java you can uncover for/with them.
I'd go with Scratch. Some points regarding it:
It's a graphical programming language. It isn't text based (this might be positive or negative). It does make it more intuitive and easy for kids (7 and up).
It's actually highly object-oriented. The objects you write these graphical scripts for have the code attached to them and can be reused and moved around.
Very Important: quick and impressive results. Kids need to get going fast and get results in order to get hooked.
I'd like to note that although many of us started programming at a young age in BASIC or LOGO and became programmers later in life, that doesn't mean those are good languages to start with. I think that kids today have much better options, like Scratch or Alice.
Text based languages (python, ruby, basic, c# or even c) are dependent on external libraries and tools (editors, compilers) while something like Alice or scratch is all inclusive and will teach kids (not aimed at teens) programming concepts. Later they can move on and expand their learning.
Check out Phrogram (formerly KPL) and Alice
I'd say: give the kid a real C64, because that's how I got started. But, today... I'd say Ruby, but Ruby is a bit too chaotic. BASIC would be better in the long run. Processing is easy to learn, and it's basically Java.
The reason I recommend a C64 is because it's BASIC, but you still have to learn certain computer-related things, like the memory model, pixels, characters, character maps, newlines, etc. etc, if you want to do more advanced stuff. Also, if your kid finds it boring, you know his heart really isn't into coding.
I would pitch LOGO. It was something that was taught in my elementary school. It gives nearly immediate feedback, and will teach really basic programming concepts. Moving that little turtle around can be a lot of fun.
For a child, I would go with Alice.
Here is another vote for Alice. My 4 kids have had a ton of fun working with it and learning the basic concepts of programming. Of course, to them it's all about socializing with fairies and ogres, but heck, the darn legacy system I work on could use some fairies and ogres too.
I'd recommend Python, because it's so terse and expressive. It seems less likely to frustrate when getting started, but offers plenty of room to learn more advanced concepts as well.
Game Maker might be another approach. You can start simple with easy drag and drop development, and then introduce more advanced programming as you go. The book The Game Maker's Apprentice: Game Development for Beginners has a number of sample games and takes you through the steps required to make them.
I think Python is a good alternative; it is a very powerful language, and you can easily do a lot of things with it (not boring at all).
Check out Squeak, developed by Alan Kay, who thinks programming should be taught at an early age.
How old? Lots of us started with BASIC at some point, but before then, I learned the concepts of stringing commands together, variables, and looping with LOGO. Figuring out how to draw a circle with a triangle that can only go in a straight line and turn was my very first programming accomplishment.
Edit: This question & its answers make me feel old.
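The circle trick mentioned above is usually written in Logo as something like `REPEAT 360 [FORWARD 1 RIGHT 1]`: go forward a little, turn a little, and repeat. Here is a rough sketch of the same idea in Python, without any graphics, just tracking where the turtle would be. The function name `walk_circle` and its parameters are made up for illustration:

```python
import math


def walk_circle(steps=360, step_length=1.0):
    """Simulate a turtle that can only move straight ahead and turn.

    Returns the list of (x, y) points the turtle visits.
    """
    x, y, heading = 0.0, 0.0, 0.0   # start at the origin, facing along the x axis
    turn = 360.0 / steps            # degrees to turn after each step
    path = [(x, y)]
    for _ in range(steps):
        # Move forward in a straight line along the current heading.
        x += step_length * math.cos(math.radians(heading))
        y += step_length * math.sin(math.radians(heading))
        # Turn a little (counterclockwise here; Logo's RIGHT turns clockwise).
        heading += turn
        path.append((x, y))
    return path


path = walk_circle()
end_x, end_y = path[-1]
# After a full 360 degrees of small turns, the turtle is back where it started.
print(round(end_x, 6), round(end_y, 6))
```

The path is really a polygon with many short sides, which is exactly the insight a kid gets from the turtle: enough tiny straight lines look like a circle.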
Though _why hasn't given it much love in the past year or so, for a while I was really excited about Hackety Hack. I think the key for most new programmers, especially children who are more than apt to lose interest in things, is instantaneous feedback. That was the really wonderful thing about Hackety Hack: a few lines of code, and suddenly you have something in front of you that does something. There are a few similar applications aimed at things like drawing graphics (one of which, Scribble!, I briefly assisted Nathan Weizenbaum on). Kids simply need positive feedback that they're doing something correct on a regular basis, else there's nothing to keep them interested in the task at hand. What I think the future is for teaching children to program is some sort of DSL built on top of a language with friendly syntax (these would include, arguably, Ruby, Python, and Scheme) whose purpose is to provide an intuitive environment for constructing simple games (say, Tic-Tac-Toe or Hangman).
I think you should start them off in C. The sooner they can get the hang of pointers the better.
See Understanding Pointers and Should I learn C.
I think the first question is: what sort of program would it be interesting to create? One of the things that got me started with programming as a kid (in BBC BASIC and then QBasic) was the ease of writing graphical programs. I could write a couple of lines of code and see my program draw a line on the screen straight away.
The closest I've seen to that sort of simplicity recently are the pygame library for Python and Processing, a set of Java libraries with an IDE.
I imagine that hacking on web pages would be another good way to get started: that would entail HTML, JavaScript (using a library like jQuery), perhaps PHP or something along those lines.
Whatever tools you provide, the crucial thing is for it to be easy to get started straight away. If you have to write twenty lines of correct code and figure out how to invoke the compiler before you see any tangible results, progress is going to be slow.
There are many good suggestions here already. I really agree with Kronikarz. Get a retro computer (or emulator) that you are interested in and teach with that. Why a retro computer? BASIC is built in. Making sounds and primitive graphics is a trivial task. The real deal might be better than an emulator because it will be a bit more fascinating to a child who is used to seeing only modern devices.
As I said here, I'd go for Squeakland and the famous Drive a Car example (powered by Squeak).
Smalltalk syntax is simple, which is great for children.
And later, as the child evolves, he can learn more complex and even very advanced concepts that are also in Squeak (e.g. programming stateful web apps with automated refactoring and automated unit tests!).
And like #cpuguru and #Rotem said, Scratch (also Squeak based) is great too.
I think Java might be a good choice simply because you can make GUIs easily, and see "cool things" happening. For the same reason, maybe any of the .NET languages. I've also heard good things about scripting languages (Ruby and Python, especially) for getting kids to learn how to program.
Well, if they're young and haven't learnt their ABCs, you could try them on BF - none of those pesky letters and numbers to deal with.
I'll get me' coat.
Skizz
I would go with what I wish I had known first: a simple MS-DOS box and the integrated assembler (debug). It is great to really learn and understand the basics of talking to a computer.
If that does not scare away a child, then I would go the "next level up" and introduce C. This shouldn't be hard, given that the basic concepts of pointers, registers and instructions in general are well understood by then.
However, I am not entirely sure where to go next. Take the big jump to Lisp, Haskell or similarly abstract languages? Or should some simple object-oriented languages (maybe even C++) be thrown in, or would that hurt more than help?
Looking at Alice, I see it is "designed for high school and college students". There appears to be another language/version called Story Telling Alice that is "designed for middle-school students"
Alice Download Page
I think Context Free Art might be a good choice; with its graphical output, it makes learning about context-free grammars a lot of fun.
Try Guido van Robot. It's an excellent introduction to robotics, and it's a great way to introduce kids to the programming side of things (vs the "building the robots" side).
Wasn't Smalltalk designed for such a purpose? I think Ruby would be a good choice, as a descendant of Smalltalk.
I know in the first few years of high school we were 'taught' Logo, and strangely, HTML. After that, the progression went to macros in MS Office, followed by basic VBA, followed by Visual Basic.