Why no programming in English? What is the difference between natural languages and programming languages? - language-agnostic

What is the key difference between natural languages (such as English and French) and programming languages like C++ and Perl?
I am familiar with the ambiguity problem, but can't it be solved using an interactive compiler or using a subset of the natural language using a strict grammar but all the time still retaining the essence of the language?
Another issue is context. But lawyers have ways to solve this issue. (This question is not about reducing the programming complexity, it's simply about concise reasons and roadblock in using natural languages for instructing computer.)
Is there any other significant problem besides these two? Or do these two have greater consequences than I mentioned above? Is the interactive solution and lawyers language technically not feasible for programming?

This is an extremely interesting question and in short, yes, there are some very good reasons why we don't use English to write programs.
It's been said before that the greatest gift that computer science has given us is not the ability to talk to computers but now that formal languages exist for describing algorithms we now have even better tools for communicating these ideas to other people. Even if a computer is not involved. Indeed the best software engineers see their jobs primarily as writing software that is readable to other people so as to make maintenance and addition of new features as easy as possible. This is not possible in a language as big and as free form as any natural, spoken language.
Ambiguity
One reason is that of ambiguity. Have you ever looked a menu in a restaurant and seen that with your burger you can get "Coleslaw and fries or salad"? What does this mean? Can I get both coleslaw and fries or the other option is a salad alone? Or do I always get coleslaw and I have to chose between the fries or a salad? English is full of these things.
I used to teach a class on this and an example I liked to use to explain ambiguity was as follows. I asked the students to write a one paragraph story ending with the sentence "Tom asked Chris if he could help him". About half the time the stories written indicated that the student interpreted the sentence as Tom asking for assistance from Chris. The other half of the time people thought Tom was offering to lend Chris a hand.
If you think about it, there are a lot of people who do write programs in English. They're called product managers and the compiler they use is software engineers. The problem here is that a software engineer has to inject a lot of his own understanding of the problem to understand what the description really means. And trust me, there is a lot of back and forth. Even on very simple business requirements I must clarify ambiguities.
Context
I would not agree that lawyers have ways to solve the context problem. We continually have ongoing arguments in courts among, in some cases, some of the most educated people in the country, about the meanings of various laws. Sometimes this involves arguing about the context in which the laws were written long ago. Sometimes in involves applying it to a new, previously non-existent context like the Internet. The fact that we have thousands of lawyers working on disambiguating these issues is proof that it cannot be handled by a simple computer program like a compiler. It's just too hard of a problem.
Conciseness
Another issue is just the ability to be concise. Mathematics long ago invented notations for many different concepts in the maths because it's just easier to read if there's a special syntax that is concise and has a well defined meaning. A mathematician knows what it means when I say "f(x) = 3x+1". It means the same thing as "There is a function called f and it has one argument. The value of applying f to a number is the number that is one more than three times the number given." But the former is a lot easier to read once you've learned the syntax. The same is true for programming languages. Programming languages are specialized to describe computations.
Implementation
Creators of programming languages deliberately create very small languages. These are, in fact, subsets of English also with some extra syntax. The idea of understanding all of English in all of its free form ways and, worse yet, increasing vocabulary is a job for Natural Language Processing (NPL). A very hard job. If you want to be able to assign unambiguous meaning to a program and have the program's behavior never change, you need a well defined syntax and semantics.
The take-away point here is that English is a very big, very flexible language with no formal specification. Programming languages need to have a well-defined syntax and semantics in order for algorithms to have unambiguous and unchanging meaning. Someone could, in fact, write a formal syntax for a subset of English and give it unambiguous meaning. But this would be a huge, huge job.
Its been done
Check out BabelBuster. The idea here was to take C and convert it to and from a very small subset of very rigorous English such that one could write a program in C and then convert it to English. During the DeCSS DVD decryption arguments, the MPAA was trying to get programs that could decrypt their DVDs declared illegal. BabelBuster fought back with a very interesting idea. Create a way to convert English, which is protected under the freedom of speech into working code in C and thus make the point that C code is also just a language which should be protected as such. Therefor one should be able to publish code that cracks DeCSS. It's an interesting piece of work relevant to your question regardless of which side you're on.
The problem with BabelBuster is that you need to write your program in a very, very limited subset of English. But it is possible to do this.
Conclusion
English, like all natural languages, allows us to describe a computation or algorithm but the language is verbose, offers many ways to say the same thing, is dependent on the context of the speaker, and not formally specified. If your goal is to describe computations, you should take English, chose a minimal workable subset in which you can say everything you need. Formally specify what each word in this subset will mean. Then create a few special notations to make it concise to say the things you say just like mathematics did. If you do this, you do this you'll end up with a typical programming language, or something like it.

There are three principal reasons.
First, as Gabe says - people have figured out through trial and error that programming in things that are close to English sentences only forces programmers to type more useless cruft. (And yes, COBOL was explicitly designed to read more "naturally".)
To a programmer,
windows++
is more readable than
You should now increment the number of windows by one.
For example, Tetris is a rather easy game to code. I would be terribly surprised if you managed to make an English explanation that is detailed enough for a computer (remember, computers are dumb, so you have to spell it all out) in less pages than a short novel.
The second reason is that the range of things a computer knows how to do is rather small, so the number of language constructs that are needed for that is also limited. In contrast, natural languages need to be able to express the entirety of human experience, which does require many language constructs to pull off. For example, "According to his wife, John would have caught the fish yesterday if it hadn't rained" is not expressible in C - and does not need to be.
And third is, indeed, ambiguity, as you yourself note. There are a lot of places where a software error is simply not permissible. People do enough bugs in unambiguous languages; allowing ambiguity would be a disaster waiting to happen. And on the same subject, we are still unable to parse human language sufficiently well - state of the art parsers still have unacceptably high error rates.

It is possible to automatically translate structured English into code, as long as a restricted subset of the English language is used.
As a proof-of-concept, I have developed an programming language called EngScript, which translates English sentences sentences into Python source code.
Arithmetic operations can be written in plain English:
#print{3 to the power of 2}
#print{3 raised to the power of 2}
#Both of these statements print "9".
print{3 plus (the sum of 1 and 2)}
#This prints "5".
Variables can be initialized in plain English, too:
let x be (x plus 1)
if (x is not equal to 7) :
print x

Related

What are important languages to learn to understand different approaches and concepts? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
When all you have is a pair of bolt cutters and a bottle of vodka, everything looks like the lock on the door of Wolf Blitzer's boathouse. (Replace that with a hammer and a nail if you don't read xkcd)
I currently program Clojure, Python, Java and PHP, so I am familiar with the C and LISP syntax as well as the whitespace thing. I know imperative, functional, immutable, OOP and a couple type systems and other things. Now I want more!
What are languages that take a different approach and would be useful for either practical tool choosing or theoretical understanding?
I don't feel like learning another functional language(Haskell) or another imperative OOP language(Ruby), nor do I want to practice impractical fun languages like Brainfuck.
One very interesting thing I found myself are monoiconic stack based languages like Factor.
Only when I feel I understand most concepts and have answers to all my questions, I want to start thinking about my own toy language to contain all my personal preferences.
Matters of practicality are highly subjective, so I will simply say that learning different language paradigms will only serve to make you a better programmer. What is more practical than that?
Functional, Haskell - I know you said that you didn't want to, but you should really really reconsider. You've gotten some functional exposure with Clojure and even Python, but you've not experienced it to its fullest without Haskell. If you're really against Haskell then good compromises are either ML or OCaml.
Declarative, Datalog - Many people would recommend Prolog in this slot, but I think Datalog is a cleaner example of a declarative language.
Array, J - I've only just discovered J, but I find it to be a stunning language. It will twist your mind into a pretzel. You will thank J for that.
Stack, Factor/Forth - Factor is very powerful and I plan to dig into it ASAP. Forth is the grand-daddy of the Stack languages, and as an added bonus it's simple to implement yourself. There is something to be said about learning through implementation.
Dataflow, Oz - I think the influence of Oz is on the upswing and will only continue to grow in the future.
Prototype-based, JavaScript / Io / Self - Self is the grand-daddy and highly influential on every prototype-based language. This is not the same as class-based OOP and shouldn't be treated as such. Many people come to a prototype language and create an ad-hoc class system, but if your goal is to expand your mind, then I think that is a mistake. Use the language to its full capacity. Read Organizing Programs without Classes for ideas.
Expert System, CLIPS - I always recommend this. If you know Prolog then you will likely have the upper-hand in getting up to speed, but it's a very different language.
Frink - Frink is a general purpose language, but it's famous for its system of unit conversions. I find this language to be very inspiring in its unrelenting drive to be the best at what it does. Plus... it's really fun!
Functional+Optional Types, Qi - You say you've experience with some type systems, but do you have experience with "skinnable* type systems? No one has... but they should. Qi is like Lisp in many ways, but its type system will blow your mind.
Actors+Fault-tolerance, Erlang - Erlang's process model gets a lot of the buzz, but its fault-tolerance and hot-code-swapping mechanisms are game-changing. You will not learn much about FP that you wouldn't learn with Clojure, but its FT features will make you wonder why more languages can't seem to get this right.
Enjoy!
What about Prolog (for unification/backtracking etc), Smalltalk (for "everything's a message"), Forth (reverse polish, threaded interpreters etc), Scheme (continuations)?
Not a language, but the Art of the Metaobject Protocol is mind-bending stuff
I second Haskell. Don't think "I know a Lisp, so I know functional programming". Ever heard of type classes? Algebraic data types? Monads? "Modern" (more or less - at least not 50 years old ;) ) functional languages, especially Haskell, have explored a plethora of very powerful useful new concepts. Type classes add ad-hoc polymorphism, but type inference (yet another thing the languages you already know don't have) works like a charm. Algebraic data types are simply awesome, especially for modelling trees-like data structures, but work fine for enums or simple records, too. And monads... well, let's just say people use them to make exceptions, I/O, parsers, list comprehensions and much more - in purely functional ways!
Also, the whole topic is deep enough to keep one busy for years ;)
I currently program Clojure, Python, Java and PHP [...] What are languages that take a different approach and would be useful for either practical tool choosing or theoretical understanding?
C
There's a lot of C code lying around---it's definitely practical. If you learn C++ too, there's a big lot of more code around (and the leap is short once you know C and Java).
It also gives you (or forces you to have) a great understanding of some theoretical issues; for instance, each running program lives in a 4 GB byte array, in some sense. Pointers in C are really just indices into this array---they're just a different kind of integer. No different in Java, Python, PHP, except hidden beneath a surface layer.
Also, you can write object-oriented code in C, you just have to be a bit manual about vtables and such. Simon Tatham's Portable Puzzle Collection is a great example of fairly accessible object-oriented C code; it's also fairly well designed and well worth a read to a beginner/intermediate C programmer. This is what happens in Haskell too---type classes are in some sense "just another vtable".
Another great thing about C: engaging in Q&A with skilled C programmers will get you a lot of answers that explain C in terms of lower-level constructs, which builds your closer-to-the-iron knowledge base.
I may be missing OP's point---I think I am, judging by the other answers---but I think it might be a useful answer to other people who have a similar question and read this thread.
From Peter Norvig's site:
"Learn at least a half dozen programming languages. Include one language that supports class abstractions (like Java or C++), one that supports functional abstraction (like Lisp or ML), one that supports syntactic abstraction (like Lisp), one that supports declarative specifications (like Prolog or C++ templates), one that supports coroutines (like Icon or Scheme), and one that supports parallelism (like Sisal). "
http://norvig.com/21-days.html
I'm amazed that after 6 months and hundreds of votes, noone has mentioned SQL ...
In the types as theorems / advanced type systems: Coq ( I think Agda comes in this category too).
Coq is a proof assistant embedded into a functional programing language.
You can write mathematical proofs and Coq helps to build a solution.
You can write functions and prove properties about it.
It has dependent types, that alone blew my mind. A simple example:
concatenate: forall (A:Set)(n m:nat), (array A m)->(array A n)->(array A (n+m))
is the signature of a function that concatenates two arrays of size n and m of elements of A and returns an array of size (n+m). It won't compile if the function doesn't return that!
Is based on the calculus of inductive constructions, and it has a solid theory behind it.
I'm not smart enough to understand it all, but I think is worth taking a look, specially if you trend towards type theory.
EDIT: I need to mention: you write a function in Coq and then you can PROVE it is correct for any input, that is amazing!
One of the languages which i am interested for have a very different point of view (including a new vocabulary to define the language elements and a radical diff syntax) is J. Haskell would be the obvious choice for me, although it is a functional lang, cause its type system and other unique features open your mind and makes you rethink you previous knowledge in (functional) programming.
Just like fogus has suggested it to you in his list, I advise you too to look at the language OzML/Mozart
Many paradigms, mainly targetted at concurrency/multi agent programming.
Concerning concurrency, and distributed calculus, the equivalent of Lambda calculus (which is behind functionnal programming) is called the Pi Calculus.
I have only started begining to look at some implementation of the Pi calculus. But they already have enlarged my conceptions of computing.
Pict
Nomadic Pict
FunLoft. (this one is pretty recent, conceived at INRIA)
Dataflow programming, aka flow-based programming is a good step ahead on the road. Some buzzwords: paralell processing, rapid prototyping, visual programming (not as bad as sounds first).
Wikipedia's articles are good:
In computer science, flow-based
programming (FBP) is a programming
paradigm that defines applications as
networks of "black box" processes,
which exchange data across predefined
connections by message passing, where
the connections are specified
externally to the processes. These
black box processes can be reconnected
endlessly to form different
applications without having to be
changed internally. FBP is thus
naturally component-oriented.
http://en.wikipedia.org/wiki/Flow-based_programming
http://en.wikipedia.org/wiki/Dataflow_programming
http://en.wikipedia.org/wiki/Actor_model
Read JPM's book: http://jpaulmorrison.com/fbp/
(We've written a simple implementation in C++ for home automation purposes, and we're very happy with it. Documentation is under construction.)
You've learned a lot of languages. Now is the time to focus on one language, and master it.
perhaps you might want to try LabView for it's visual programming, although it's for engineering purposes.
nevertheless, you seem pretty interested in all that's out there, hence the suggestion
also, you could try the android appinventor for visually building stuff
Bruce A. Tate, taking a page from The Pragmatic Programmer wrote a book on exactly that:
Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages
In the book, he covers Clojure, Haskell, Io, Prolog, Scala, Erlang, and Ruby.
Mercury: http://www.mercury.csse.unimelb.edu.au/
It's a typed Prolog, with uniqueness types and modes (i.e. specifying that the predicate append(X,Y,Z) meaning X appended to Y is Z yields one Z given an X and Y, but can yield multiple X/Ys for a given Z). Also, no cut or other extra-logical predicates.
If you will, it's to Prolog as Haskell is to Lisp.
Programming does not cover the task of programmers.
New things are always interesting, but there are some very cool old stuff.
The first database system was dBaseIII for me, I was spending about a month to write small examples (dBase/FoxPro/Clipper is a table-based db with indexes). Then, at my first workplace, I met MUMPS, and I got headache. I was young and fresh-brained, but it took 2 weeks to understand the MUMPS database model. There was a moment, like in comics: after 2 weeks, a button has been switched on, and the bulb has just lighten up in my mind. MUMPS is natural, low level, and very-very fast. (It's an unbalanced, unformalized btree without types.) Today's trends shows the way back to it: NoSQL, key-value db, multidimensional db - so there are only some steps left, and we reach Mumps.
Here's a presentation about MUMPS's advantages: http://www.slideshare.net/george.james/mumps-the-internet-scale-database-presentation
A short doc on hierarchical db: http://www.cs.pitt.edu/~chang/156/14hier.html
An introduction to MUMPS globals (in MUMPS, local variables, short: locals are the memory variables, and the global variables, short: globals are the "db variables", setting a global variable goes to the disk immediatelly):
http://gradvs1.mgateway.com/download/extreme1.pdf (PDF)
Say you want to write a love poem...
Instead of using a hammer just because there's one already in your hand, learn the proper tools for the task: learn to speak French.
Once you've reached near-native speaking level, you're ready to start your poem.
While learning new languages on an academical level is an interesting hobby, IMHO you can't really learn to use one until you try to apply it to a real world problem. So, rather than looking for a new language to learn, I'd in your place first look for a new things to build, and only then I'd look for the right language to use for that one specific project. First pick the problem, then the tool, not the other way around..
For anyone who hasn't been around since the mid 80's, I'd suggest learning 8-bit BASIC. It's very low-level, very primitive and it's an interesting exercise to program around its holes.
On the same line, I'd pick an HP-41C series calculator (or emulator, although nothing beats real hardware). It's hard to wrap your brain around it, but well worth it. A TI-57 will do, but will be a completely different experience. If you manage to solve second degree equations on a TI-55, you'll be considered a master (it had no conditionals and no branches except a RST, that jumped the program back to step 0).
And last, I'd pick FORTH (it was mentioned before). It has a nice "build your language" Lisp-ish thing, but is much more bare metal. It will teach you why Rails is interesting and when DSLs make sense and you'll have a glipse on what your non-RPN calculator is thinking while you type.
PostScript. It is a rather interesting language as it's stack based, and it's quite practical once you want to put things on paper and you want either to get it done or troubleshoot why isn't it getting done.
Erlang. The intrinsic parallelism gives it a rather unusual feel and you can again learn useful things from that. I'm not so sure about practicality, but it can be useful for some fast prototyping tasks and highly redundant systems.
Try programming GPUs - either CUDA or OpenCL. It's just C/C++ extensions, but the mental model of the architecture is again completely different from the classic approach, and it definitely gets practical once you need to get some real number crunching done.
Erlang, Forth and some embedded work with assembly language. Really; buy an Arduino kit or something similar, and create a polyphonic beep in assembly. You'll really learn something.
There's also anic:
https://code.google.com/p/anic/
From its site:
Faster than C, Safer than Java, Simpler than *sh
anic is the reference implementation compiler for the experimental, high-performance, implicitly parallel, deadlock-free general-purpose dataflow programming language ANI.
It doesn't seem to be under active development anymore, but it seems to have some interesting concepts (and that, after all, is what you seem to be after).
While not meeting your requirement of "different" - I'd wager that Fantom is a language that a professional programmer should look at. By their own admission, the authors of fantom call it a boring language. It merely shores up the most common use cases of Java and C#, with some borrowed closure syntax from ruby and similar newer languages.
And yet it manages to have its own bootstrapped compiler, provide a platform that has a drop in install with no external dependencies, gets packages right - and works on Java, C# and now the Web (via js).
It may not widen your horizons in terms of new ways of programming, but it will certainly show you better ways of programming.
One thing that I see missing from the other answers: languages based on term-rewriting.
You could take a look at Pure - http://code.google.com/p/pure-lang/ .
Mathematica is also rewriting based, although it's not so easy to figure out what's going on, as it's rather closed.
APL, Forth and Assembly.
Have some fun. Pick up a Lego Mindstorm robot kit and CMU's RobotC and write some robotics code. Things happen when you write code that has to "get dirty" and interact with the real world that you cannot possibly learn in any other way. Yes, same language, but a very different perspective.

Why do programming competition contestants use C++ and Java? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
After competing in and following this year's Google Code Jam competition, I couldn't help but notice the incredible number of [successful] contestants that used C/C++ and Java. The distribution of languages used throughout the competition can be seen here.
After programming in C/C++ for several years, I recently fell in love with Python for its readable/straightforward nature. More recently, I learned functional languages like OCaml, Scheme, and even logic languages like Prolog. These languages certainly have their merits and, in my opinion, can be applied more easily than C++ and Java for certain situations. For example, Scheme's use of call/cc simplifies backtracking (a tool required to answer several problems) and Prolog's logic specification, although inefficient due to its brute-force nature, can drastically simplify (and even automatically solve) certain problems that are difficult to wrap one's brain around.
It is clear that a competition contestant should use the tools that are best suited for the challenge. Even x86 assembly is Turing complete - that doesn't justify solving problems with it. In this case, why are the contestants that use less common languages like Scheme/Lisp, Prolog, and even Python significantly less successful than contestants that use C/C++ and Java? Worded differently, why don't successful contestants use languages that, although may be less mainstream, are arguably better tools for the job?
There are several motivations for my question. Most importantly, I would like to become a better programmer - both in the practical aspect and the competition aspect. After being introduced to such beautiful paradigms like functional and logic programming, it is discouraging to see so many people discard them in favor of C/C++ and Java. It even makes me question my admiration for said paradigms, worrying that I cannot be successful as a Lisp/Scheme/Prolog programmer in a programming competition.
Great question! As someone who has dabbled in programming contests a bit myself, I may have something to say.
[Let's get the standard disclaimer out of the way: contest programming is only loosely related to "programming in the real world", and while it tests algorithmic and problem-solving skills and the ability to come up with fast bug-free working code under time pressure, it does not necessarily correlate with being able to build large software projects, write maintainable code, etc (beyond the fact that well-structured programs are easier to debug).]
Now for some answers:
C++/Java are more common than other languages in the real world as well, so you'd expect to see a higher proportion anywhere. (But it's even higher in the contest population.)
Many of these participants are students, or got into contests as students, and C++/Java are more common "first languages" that students learn. (Undergrad students these days may start with Scheme, Haskell, Python, etc., but high-schoolers (often self-taught) less often.) In fact, many of the Eastern European participants still use Pascal, and are more amazing with it than the rest of us will ever be with any language.
The school- and college-level contests usually use these languages. The International Olympiad in Informatics (IOI) allows only C, C++ and Pascal (or maybe it allows Java now; I haven't kept up), and the ACM Intercollegiate Programming Contest (ACM ICPC) allows only C, C++ and Java. TopCoder allows C++, Java, C# and VB (really :p); and recently, Python. So you could say the "contest ecosystem" has more C++/Java programmers in it. Google Code Jam and IPSC are among the few contests that allow code in any language, actually.
Now the question is, in GCJ where the contestants are free to choose a language, why wouldn't they choose Python or Scheme? The most relevant factor is that these languages are slow. Sure, for most real-world programming they are easily fast enough, but for the tight loops that are often involved in getting a program to run under the n-second limit for all test cases, these languages don't cut it for any of the algorithmically more involved problems. (A problem designed to accept O(n log n) solutions but not Θ(n2) solutions for C/C++ frequently rules out even optimal O(n log n) solutions in slower languages. Even Java used to be given a handicap at USACO; I'm not sure this is still the case.)
Another factor is the libraries: C++ and Java have better libraries for frequently useful algorithms and data structures (e.g. red-black trees, C++'s next_permutation), while Python's libraries (good enough for the real world) are less useful here, and Prolog and Scheme... I don't know about their libraries. This is a relatively minor factor, because these programmers can write their own code when necessary. :-)
General-purpose multi-paradigm languages are more useful for just getting things done within the time constraints of the contest, than languages that force a philosophy or way of doing things on you. This is why Prolog will always remain unpopular, for instance. (General philosophy: some languages are "enabling" languages that let you do anything including shooting yourself in the foot, some are "directing" that force you to do things the right way.) This is also why C++ is three times more popular than Java in the general contest participants, and much more popular among the top contestants. Since code doesn't have to be read by anyone else, it's ok and even useful to have loop macros like FOR(i,n) (less code to type, and more importantly less chance of making a bug when in a hurry). Nothing against Java, there are a few top programmers who use Java too. :-)
Finally, although many of these top programmers may have C++/Java/Pascal as their "first language", they are not good because of their language, so you don't have to despair about that. Many of these same programmers have won contests like the ICFP contest even with intentionally using crazy languages like shell scripts, m4 (used in autoconf), and assembly (the team named "You Can't Spell Awesome Without ASM").
I liked Jerry Coffin's idea of plotting contestants of the Google AI contest, so I took all of the results and plotted them (calculated mean, standard deviation, and then graphed the normal distribution curves in Excel).
With Lua and JS, got this:
Without (there were few contestants, so maybe the results are skewed):
It looks like Java participants did markedly worse than the rest, while Go, Common Lisp, and C are on the better end.
Why we all speak English and not Esperanto? Well, it just happened so. Even though English is inconsistent and bloated and Esperanto is intentionally designed as 'better tool'.
Thus, one reason is a tradition. In most schools programming is still taught in C/C++, Java, Pascal or even Basic. And participate in those contests mostly students, which choose language they know better.
Also, you can notice that most algorithmic books feature psedudocode in style of Pascal or Ada, and very very rarely - Lisp. I don't know why, perhaps also a tradition. Or perhaps it's just not so good for the algorithms.
Another reason would be speed. Although it's not a problem for Google Code Jam, in almost all contests 2x speed gap is a difference between 'Accepted' and 'Time Limit' verdicts.
In other words, if optimal algorithm in C++ runs 10 times faster than in Ruby, it may mean that sub-optimal algorithm in C++ will still be faster than a good one in Ruby. And contest authors usually don't want to allow O(n^2) submissions, if O(n*logn) can be achieved.
First, I'd question your premise [edit: or what I take to be a premise -- that contestants using C++ and Java fare about equally well]. For example, here's what languages were used for the entries that came in the first 100 places and the last 100 places in Google's recent AI contest:
Contestants using C++ and Java did not seem to be anywhere close to equally successful in that contest. Contestants using Python didn't seem to fare particularly well either, though there were considerably fewer of them, weakening any conclusion in that regard.
Second, of course, an awful lot of the explanation (as others have pointed out) is undoubtedly just the number of people who are familiar with each language. There are probably more people taking a course in Java right now than the total number of people who've ever written any Lisp, Scheme or Prolog.
Edit: I think a third possibility is simply versatility. To pick an extreme example, Prolog is very well suited to a few problems, but equally poorly suited to many others. Few people can (or at least do) learn more than one or two languages well enough to use them in a contest, so most people who are interested in such things are likely to choose languages that can work reasonably well for almost anything, rather than attempting to learn a specialized language for every problem that might be chosen.
In nearly all Google Code Jam rounds, more of the higher-performing contestants code in C++.
Below are the language stats from Google Code Jam 2012 Round 1A, 1B, and 1C (listed top to bottom).
The number of contestants in each round are 3,686, 3,281, and 3,189 respectively.
fun question, probably should be community wiki.
Look at number of finalists by countries: http://www.go-hero.net/jam/10/regions. notice number of people from East Europe and Russia. those places have very strong C++ communities, as well as Java, for number of reasons.
look at number languages in qualifiers: http://www.go-hero.net/jam/10/languages/0 and finals: http://www.go-hero.net/jam/10/languages/6. C++ starts out less than half and has 75 percent in finals. either good programmers prefer C++ or C++ makes the programmers. Probably by the time you master C++, other things become trivial.
You are free to draw your own conclusions though.
First of all, as you have pointed C++ and Java are mainstream languages. These automatically means that people who start doing programming competitions will be introduced to them first - by the way who learns Lisp as a first language:) I also participate regularly in such competitions - I use C++ to compete, although my favorite language is Java. It is just I want to practice another language apart from Java - also C++ is a little bit less verbose and runs faster which is important for programming competitions.
Now to my point - people become experts first in mainstream languages. To participate in programming competitions you must have quite a good grasp of the language you are using. You don't have time to search on the internet for stupid things - like forgot a construct. It is just that speed is an important factor there. To use Lisp in a competition, you must be fond of it. I don't think there are such many people out there. Correct me if I am wrong. And honestly the pros you have mentioned like simplifies backtracking: In whatever language backtracking is easy - declare a method and just call it again for every possible outcome. It couldn't be simpler. I haven't felt till now that the language I am using is trying to trip up my feet for programming competitions.
OMG ... People are all going through the Stats and Figures !!
Lets not forget the basics.. These are the only two languages (mostly) which are taught to people in college/schools...!
That might answer the heavy rush!
A vital reason might be that every contests don't support languages like python or prolog. Specially ACM ICPC World Finals support C/C++ and Java. And TopCoder also supports only C++, Java, C#, VB, and now Python. It is natural for the contestants that they will choose one language that is available in every contest. Another reason might be execution speed. And yes, another reason is these are the languages that most of the people learn first.
Big libraries were a selling point for Java in ACM ICPC. It's handy to be able to realize you want some random data structure or algorithm and just pull it out of the standard libraries.
Keep in mind that C++ is not only the majority among all contestants, but as the rounds progress, its percentage just keeps and keeps improving.
I'd say it is true that most of the participants are students (However, since it is an open tournament with chances to a job interview with google, then you have to consider that many who participate are graduated). But the latest rounds are only for people with ton of experience. They are not just students who just learned to code in C++ / Java.
Of course, the student argument also works against languages like LISP and OcaML or ProLog. That is languages, that are used a lot in AI areas but in the mainstream world students are the most likely to be learning and use them.
Big contests other than google's support few languages, but that still wouldn't explain why Pascal or .net are not near the level of Java (As they tend to be equally supported in the major contest events).
A lot of the best coders in these contests know a lot of languages. But they still prefer to use C++ during the rounds it must be for a bigger reason than "learned C++" first.
I would argue against the claim that languages other than C++ or Java are better tools for the job. If direct data says that the finalists are more likely to use C++ and Java it is a direct contradiction to that claim.
Google AI competition data does not actually contradict any premise regarding the code jam. It actually does show that top coders are able to use languages like Common Lisp when it is truly the better tool for the job. If we want to use this data to assume that CLISP is a great tool for AI competitions, then we should also assume that C++ is a great tool for algorithm competitions like GCJ.

Programmers dictionary/lexicon for non native speakers

I'm not an English speaker, and I'm not very good at English. I'm self thought. I have not worked together with others on a common codebase. I don't have any friends who program. I don't work with other programmers (at least nobody who cares about these things).
I guess this might explain some of my problems in finding good unambiguous class names. I have tried to find some sort of "Programmers dictionary" containing words often used and their meanings. When reading others code I have to look up words quite often, and as many use abbreviations this poses an additional challenge.
My very limited vocabulary "forces" me to use bad class names like xxManager, xxProvider, xxWhatever. It's usually less problematic choosing variable and method names.
Other non English people out here: How have you managed to cope with this? Have you studied English so well it's not a problem? Or have you read so much code naming comes natural? Or discussed a lot with English speakers? Found any good websites, articles or other publications? As I've never read anything regarding programming in my own language, I often have more problems trying to find the words in my language...
PS: All other posts I've found was regarding mixing native tongue and English... And I understand this might be a bit off topic and might be closed.
Edit: Some resources from the answers and other stuff I use:
Jargon / The New Hacker's Dictionary
Common design patterns
Google translate
Dictionary
The Jargon file will help with the more obscure references people will give in the industry.
http://catb.org/jargon/html/go01.html
Other than that..finding good names for your variables/classes/etc is hard. Often times, it's harder than actually solving the problem. Here's a good resource for some common design pattern names people like to use: http://en.wikipedia.org/wiki/Design_pattern_%28computer_science%29
Examples:
AbcFactory
XyzBridge
Could be an unorthodox suggestion, but I would recommend studying English more deeply (I am also a non-native speaker).
Expose yourself to as much English as possible! Watch movies, read English fiction, listen to technical podcasts.
Mind you, if you really want to deepen your knowledge of English, you're probably not going to learn a lot watching "Transformers". On the other hand, diving into Ulysses probably is not a good strategy either.
If you're feeling adventurous, you could always get a subscription to the New Yorker magazine. It'll do things to you - yes this is flamebaiting. :P
Other non English people out here:
How have you managed to cope with this?
Good naming in code matters. Using English is the preferred, but if you don't know English very well the result could be counterproductive.
I had a friend who just guessed what the correct name would be and the result was horrible. ie
String employiiNeim; // employeeName
int eich; // age
The problem with English, is that is not pronounced as written ( french have this minor ... ehrm characteristic ) Other languages like Spanish, German, Dutch, and others, do type and pronounce every letter in the word.
This becomes particular relevant when what you are coding are business rules or business models. In this case it is much better to use your native language.
String nombreEmpleado;
int edad;
Way much better, specially when you work with others.
Have you studied English so well it's not a problem?
Yeap, there is no other way, and a lot of practice.
You can study English the same way you study programming languages though. You can have a teacher and attend to a class room and study an hour a day. Or ( what I did ) you can just grab something that is interesting to you and try to understand it. For instance, you have a small document describing something you care, you read blogs or read content here at StackOverflow, you translate a song you like, etc. etc.
All these are study forms. There is no other way, you won't wake up one day and say: "...I know kung fu" I mean, and say: ..."I know English"
Or have you read so much code naming comes natural?
Also helps, but if you don't understand what the code means, you ... well won't make any progress.
You'll learn the programming language, and that will help you to understand English bit better, but won't help you to learn it. That's because when we program we learn the programming language not the native language.
Or discussed a lot with English speakers?
Eerhh..nope. If you have that chance go ahead, it will improve your listening and speaking, but not necessarily your writting.
The most effective way to improve your English vocabulary and grammar is by READING ( reading in your native language also improves your own language btw )
So, I would say, read as much as you can. Use your native language while you gain more confidence, and keep studying.
The English will come with time.
If you can't find the "Programmer's Dictionary" you're looking for, start one. Post a new question: "What entries are missing from this Dictionary for English-as-a-Second-Language-Programmers?" and seed it with 10 or 20 words/definitions you've already discovered. Once posters have suggested enough additions, move it to a a wiki somewhere and keep accepting contributions. You might end up creating a valuable resource.
Documenting your code with excellent prose like your question above will go a long way!
If you stick to common design patterns endemic to the language, platform, and architecture for which you're working with, other engineers should understand your nomenclature fairly easily.
If you are worried about it in terms of naming your own objects, just think of what your native word is for what you want to do, then go get an english language translation dictionary, and use the english language version.
How about using your native language?
Of course (like for me as an Austrian) some letters may not be allowed - but who cares if there is Mörder or Moerder (Murder) in the class name :)
Or (as I do) use a dictionary like dict.cc or something else.
I do - think what the class does - it manages game session (for an example) so it will become GameSessionManager.
Abbreviations are (at least for me) a problem - but what I've learned from other code - event native speakers use different abbreviations.
And if the class is called GameSessionMgr or GameSessionMngr doesn't make a difference.
Your are not writing books or some kind of "english poem" where spelling, grammar and... counts.
You write code - and if you follow "your sepcial rules" - you and others will (after some time) be able to understand you code and class names.
It will come with time and experience. Above all attempt to (like #Mike A says) document things until the code becomes clearer and try to be consistent.
This is an issue that I run into as well, even as a native English speaker. As a programmer, I often find that I need to find a descriptive word for a class, variable, function, etc. I often find myself asking a friend or coworker what verbage they would use by explaining my idea, carefully excluding any words I myself have considered as a possible choice for the class/function/variable name so as not to inhibit their creativity.
It seems to me that the English Language & Usage site proposal over at Area51 is a good place to ask such questions as "What would you call a class (or thing) that does this, this and that, and has properties x, y, and z?

Does knowing a Natural Language well help with Programming?

We all hear that math at least helps a little bit with programming. My question though, does English or other natural language skills help with programming? I know it has to help with technical documentation, but what about actual programming? Are certain constructs in a programming language also there in natural languages? Does knowing how to write a 20 page research paper help with writing a 20k loc programming project?
Dijkstra went so far as to say: "Besides a mathematical inclination, an exceptionally good mastery of one's native tongue is the most vital asset of a competent programmer."
Edit: yes, I'm reasonably certain he was talking about the programming part of the job. Here's a bit more complete quote:
The problems of business administration in general and database management in particular are much too difficult for people who think in IBMerese, compounded by sloppy English.
About the use of language: it is impossible to sharpen a pencil with a blunt axe. It is equally vain to try to do it with ten blunt axes instead.
Besides a mathematical inclination, an exceptionally good mastery of one's native tongue is the most vital asset of a competent programmer.
From EWD498.
I certainly can't speak for Dijkstra, but I think it's impossible to cleanly separate the part where you're doing actual programming from the part where you're interacting with people. Just for example, even when you're working alone, it's crucial that you're able to understand (clearly and unambiguously) notes you wrote down about what to do, the nature of a bug, etc. A good command of English is necessary even when nobody else is involved at all (and, of course, that's unusual except on trivial tasks).
I don't know about causality, but the skill set required to write well overlaps quite a bit with those required for programming: knowing how to plan, being able to keep a myriad of details consistent, being able to make things clear for a future reader, knowing how to organize your thoughts and the resultant product. That isn't to say that a successful author would make a good programmer, but a programmer with good language skills and the same logic/math/deductive skills is probably a better programmer than one with poor language skills -- at least the code has a greater chance of being understandable.
Yes. Strong natural language skills help you to organize your thoughts in a coherent way that can easily be understood by others. That can help improve your code in everything from naming variables, methods, classes, etc., to expressing the contexts of objects in your model. Practices such as pair programming require you to be able to communicate well with your partner in order to write good code. Techniques such as Domain Driving Design emphasize using the domain language of the business in your code. Natural language skills facilitate that. And there is a strong drive in the development industry toward more natural language-like tools, e.g. many of the newer testing tools like rspec, gherkin, etc., are moving toward more natural language-like syntax. One of the things many people like about dynamic languages like Ruby and Python are that the code tends to read more like a natural language.
Let me state what should be the obvious: every healthy person above 12 knows at least one natural language. Moreover, every healthy person above 12 is able to generate and parse natural language a complex and rich language, and express and understand an extremely large set of ideas. In general, people are not likely to be limited in their ability to discuss issues by their language, but by the type of things they experienced and learned.
Having said that, there are several language-related skills that you might have thought about.
Writing style. You mentioned those specifically. Written language is different from spoken language. Way less intuitive. This is one reason people have to get coached in writing through their years in the education system.
Coding doesn't really involve writing. I mean, there's comments, but they can be rather laconic. Of course the work of a programmer usually involves at least some writing of documents, and writing abilities to make a difference there.
Analytical skills. Analytical skills are a complicated (not to say fuzzy) concept. Analytical skills aren't really about language, but insomuch they are taught and tested at all, it's in the context of writing essays.
Analytical skills are obviously very important in programming. I am not sure that these are exactly the same skills required to write a good essay about Euthanasia or whatever, but as was previously suggested, they may be related.
Foreign language. For people whose native language isn't English, a certain command of English may be needed. Not in the coding itself (knowing what "while" means in English isn't really critical to understanding what it does in Java), but because much training and support material is available mainly in English (did anyone mention Stack Overflow?). The English requirement may differ on the country you are in, and the company you work for, though.
Communication Skills. Ahhm. I was never exactly sure what this means exactly. Maybe it's a cultural thing. I do suspect it's less about knowing a language and more about knowing people.
So to some up, Dijkstra is a venerable computer scientist, but I am not sure he knew that much about language.
Programming isn't just about writing code. On any programming project of any size there will be the need for:
initial project proposal documents
design and architectural documents
programmers manual
users manual
training materials
communication with third party suppliers
etc.
On every big project I've worked on I'd guess I spent at least 50% of my time on the English language documents. So yes, an ability to explain and express yourself well is extremely important. Does it lead to writing better code? Once again, I would say yes - the need to provide clear documentation spills over into the need to write better code, itnerfaces et al.

What do you mean by the expressiveness of a programming language?

I see a lot of the word 'expressiveness' when people want to stress one language is better than the other. But I don't see exactly what they mean by it.
Is it the verboseness/succinctness? I mean, if one language can write down something shorter than the other, does that mean expressiveness? Please refer to my other question - Article about code density as a measure of programming language power
Is it the power of the language? Paul Graham says that one language is more powerful than the other language in a sense that one language can do that the other language can't do (for example, LISP can do something with macro that the other language can't do).
Is it just something that makes life easier? Regular expression can be one of the examples.
Is it a different way of solving the same problem: something like SQL to solve the search problem?
What do you think about the expressiveness of a programming language? Can you show the expressiveness using some code?
What's the relationship with the expressiveness and DSL? Do people come up with DSL to get the expressiveness?
Personally, I feel that the "expressiveness" of a language really comes down to how clearly the language constructs can "express" the developer's intentions.
For example, I feel that C# (especially LINQ via C# 3+) is becoming much more expressive. This LINQ statement is a great example:
var results = collection.Where(item => item > 5);
Without knowing the details of the language or the implementation being used, the developer intent is (in my opinion) very clear in the above statement.
I do not think that the verboseness of the language is equal to its expressiveness, however, there is some correlation in place. If a language requires a lot of code in order to express an abstraction, it is less expressive. These are two related, but different, concepts.
The same is true with power - although here a language's features (ie: power) must be complete enough to express the abstraction clearly. Without this, expressiveness will suffer. That being said, a language can be very "powerful" in terms of features, but not necessarily be expressive, if the feature set is difficult to understand.
"Expressiveness" means the ability to say only what you want done:
bad_event = events.find(&:bad)
rather than how you want it done:
i = 0
bad_event = nil
while i < events.size && bad_event.nil?
event = events[i]
if event.bad?
bad_event = event
end
i += 1
end
Among the things that contribute to expressiveness are:
A lack of required syntactic sugar
First-class functions
Garbage collection
Either dynamic typing or type inference
The language core not being slavishly minimalistic
Good functionality in the standard library
To some degree, the expressiveness of any language can be increased by shoving as much "how to do it" off into subroutines/objects as possible so that most of the remaining code is "what to do." The amount of "how to do it" code needed in the most abstract code is one measure of a language's expressiveness: The more the code looks like pseudocode, the more expressive it is of the programmer's intent.
One can also think about the "meta-expressiveness" of a language: How expressive is the language at constructing Domain Specific Languages?
I like Matthias Felleisen's notion of expressive power, which is comparative:
Language A is strictly more expressive than language B if both of the following are true:
Any program written in language B can be rewritten in language A while keeping the essential structure of the program intact.
Some programs written in language A have to be violently restructured in order to be written in language B.
Usually we want to make these comparisons by looking at some kind of "essential core" of a language—for example, maybe we want to consider a dialect of C with only while and not also for and do...while. Or maybe we want to consider a dialect of Perl with only a prefix if form and no unless form. But sometimes these superficial syntactic distinctions are exactly what we mean by "expressive power"; to some programmers it's important to say
die ("found no solutions") unless length(solutions) > 0;
instead of
if (length(solutions) == 0) { die("found no solutions"); }
So you have to establish whether you're asking about expressive power of surface syntax or deeper structure.
The other thing I like about Felleisen's idea is that it admits of the notion of two languages which are definitely different, but neither is more expressive than the other.
You can read a more detailed exposition in the first two pages of his paper On the Expressive Power of Programming Languages. After that comes a lot of pointy-headed theory :-)
If you want an answer that's somewhat theoretical but more rigorous than most, you might want to look around for Matthias Felleisen's On the Expressive Power of Programming Languages. I'm pretty sure a bit of looking around the net will turn up at least a few copies.
If you want a more practical answer of what most people actually mean when they say it, that's, frankly, rather different. At least in my experience, an "expressive" language usually means: "I like the language, but can't cite much (if any) objective support for doing so." Conversely, things like "less expressive", or "not expressive" generally mean: "I don't like the language (or like it less), but can't cite much (if any) objective support for doing so."
"Not expressive" is often similar to a politician accusing another of being "fascist" -- clearly pejorative, but without any meaningful definition of what's supposedly wrong.
One of the big problems stems from a fundamental difference of opinion. There are at least two fundamentally different general ideas that people seem to have about expressiveness:
the ability to express a wide variety of ideas.
the ability to express some specific ideas clearly (and often succinctly).
To consider some extreme examples, assembly language would qualify as highly expressive by the first criteria--you can do essentially anything in assembly language that you can in a higher level language, and you can do some things in assembly language that you can't in essentially any higher level language.
Obviously, assembly language doesn't look nearly so good by the second measure--it typically requires quite a large amount of fairly opaque code to accomplish much. This measure would tend to favor a language like Haskell or APL, to give only a couple of examples.
These two notions of what "expressive" means are frequently close to diametrically opposed. The first tends to favor the "lowest" level languages, while the second tends to favor the "highest" level. At least from what I've seen, most people really start from the language they like, and their definition of "expressive" is basically whatever balance of the two criteria is required for their preferred language to be the "best".
For me, it is the ability of the language to clearly express my logic and ideas through code, in a way that somebody else reading the code can easily figure out what I was thinking when I wrote it.
Wikipedia has a bit about the concept. I myself take it to mean that a language can accomplish more with less (the so called "informal usage" in the Wikipedia article).
I consider JavaScript expressive (though this could be because Douglas Crockford drilled that idea into my noggin) because it can do so much with just a few keywords. For instance, the function keyword is a function, as well as a method, a class, and a lambda.
Some code illustration (leaving out some details for brevity) in JavaScript. It's an event class I wrote:
SJJS.util.Event = (function() {
var _listeners = [];
var _listenerReturns = [];
return {
addDomListener: function(element, eventName, listener) {
},
trigger: function(element, eventName) {
},
removeListener: function(eventlistener) {
}
}
})();
With just function, var, and some curly braces and parentheses I made a static class with methods and private variables.
Generally speaking, with a programming language which is turing complete you can do anything that another turing complete language can do. That being said, some can do it a lot better than other.
I take expressiveness to mean how much you can say easily, and how well / clearly it can be said. The ability to be terse is part of that ( a very powerful and terse language is one like J ). Generally I find that being concise is a good marker of being expressive. If the language can express a complex operation in a simple manner, it's going in the proper direction.
As to the power, expressiveness isn't all the power of a language. While it may be part of it, speed, security, stability, all of those things factor in as well.
example: summation of a list in Common lisp using the loop operator is concise and expressive
(loop for x in list sum x)
Precision, concision, and readability are the primary components in expressiveness.
I always took it to be roughly equivalent to how high-level a language is. If you wanted to try to quantify expressiveness, the units would be something like "machine code instructions per language statement"
A more expressive language might be very good at doing rather a lot of work without writing a lot of code. However, it would probably be more domain-specific and a wee bit slower for some tasks than a less expressive one.
From Wikipedia: In computer science, the expressive power (also called expressiveness or expressivity) of a language is the breadth of ideas that can be represented and communicated in that language. The more expressive a language is, the greater the variety and quantity of ideas it can be used to represent.
So, I agree. "How easy, comprehensive and composable the language for you to express your intents.": I believe, this is the measure of expressiveness.
QUESTION: Is it the verboseness/succinctness? I mean, if one language can write down something shorter than the other, does that mean expressiveness?
No. For example, is Brainfuck language expressive? I don't think so. Look at an Hello World example in Brainfuck:
++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.
Or: hq9plus language. Hello World code:
H
QUESTION: Is it the power of the language? Paul Graham says that one language is more powerful than the other language in a sense that one language can do that the other language can't do (for example, LISP can do something with macro that the other language can't do).
I disagree with Paul. As you see in the above examples, hq9plus language is doing Hello World with one letter: H. Whereas, most of the other languages will do it with much more letters. But, you can create composable and easy to read code with other languages. If hq9plus is doing Hello World with H, does it mean that is it powerful? I believe no.
QUESTION: Is it just something that makes life easier? Regular expression can be one of the examples.
Regexes are great, but sometimes they lose their expressive power. Sometimes, it depends on the programmer.
QUESTION: Is it a different way of solving the same problem: something like SQL to solve the search problem?
Half Yes. SQL is a declarative and very expressive language. Because the underlying engines and technologies can advance and change without you to change your SQL queries. This makes it very expressive. There are many queries have been working for decades and the underlying technology of databases change. But, your queries don't need to. I think this is due to the power of its expressiveness.
I also believe that functional languages are very expressive. Because, you only describe your intent, not your hows, and the underlying technologies can always change and optimize, but, it won't hurt your expressive code.
Example:
// top 10 products with rating higher than 5
return products
.sort(p => p.rating)
.filter(p => p.rating > 5)
.map(p => p.title)
.take(10)
Above program is expressive, it conveys your intents. And, most probably it won't change when the underlying mechanisms change.
Take for example LINQ. It allows you to use functional programming.
Functional programming emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state.
LINQ allows your to express what you want done instead of how to do it. This is a clear example of expressiveness.