Idiom vs. pattern - language-agnostic

In the context of programming, how do idioms differ from patterns?
I use the terms interchangeably and normally follow the most popular way I've heard something called, or the way it was called most recently in the current conversation, e.g. "the copy-and-swap idiom" and "singleton pattern".
The best distinction I can come up with is that code meant to be copied almost literally is more often called a pattern, while code meant to be taken less literally is more often called an idiom, but even that isn't always true. It doesn't seem to be more than a stylistic or buzzword difference. Does that match your perception of how the terms are used? Is there a semantic difference?

Idioms are language-specific.
Patterns are language-independent design principles, usually written in a "pattern language" (a uniform template) describing things such as the motivating circumstances, pros & cons, related patterns, etc.

When people observing program development from On High (analysts, consultants, academics, methodology gurus, etc.) see developers doing the same thing over and over again in various situations and environments, the intelligence gained from that observation can be distilled into a pattern. A pattern is a way of "doing things" with the software tools at hand that represents a common abstraction.
Some examples:
OO programming took global variables away from developers. For those cases where they really still need global variables, but need a way to make their use look clean and object-oriented, there's the Singleton pattern.
Sometimes you need to create a new object of one of a variety of possible types, depending on circumstances. An ugly way might involve an ever-expanding case statement. The accepted, OO-clean way to achieve this is the "Factory" or "Factory Method" pattern.
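To make that concrete, here is a minimal Java sketch of the Factory Method idea (the Document/Application names are hypothetical, invented for illustration):

interface Document { void open(); }

class TextDocument implements Document {
    public void open() { System.out.println("opening a text document"); }
}

class Spreadsheet implements Document {
    public void open() { System.out.println("opening a spreadsheet"); }
}

abstract class Application {
    // The factory method: each subclass decides which concrete type to create.
    protected abstract Document createDocument();

    void newDocument() {
        Document doc = createDocument(); // no case statement at the call site
        doc.open();
    }
}

class TextEditor extends Application {
    protected Document createDocument() { return new TextDocument(); }
}

The ever-expanding case statement disappears: adding a new document type means adding a new subclass, not editing every construction site.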
Sometimes a lot of developers do things in a certain way, but it's a bad way that should be discouraged. This can be formalized as an antipattern.
Patterns are a high-level way of doing things, and most are language independent. Whether you create your objects with new Object or Object.new is immaterial to the pattern.
Since patterns are something a bit theoretical and formal, there is usually a formal pattern (heh - word overload! let's say "template") for their description. Such a template may include:
Name
Effect achieved
Rationale
Restrictions and Limitations
How to do it
Idioms are something much lower-level, and usually operate at the language level. Example:
*dst++ = *src++
in C copies a data element from src to dst while incrementing the pointers to both; it's usually done in a loop. Obviously, you won't see this idiom in Java or Object Pascal.
while (<INFILE>) { print; }
is a Perl idiom for looping over an input file and printing out all of its lines. There's a lot of implicit variable use in that statement. Again, you won't see this particular syntax anywhere but in Perl; but an old Perl hacker will take a quick look at the statement and immediately recognize what you're doing.

Contrary to the idea that patterns are language agnostic, both Paul Graham and Peter Norvig have suggested that the need to use a pattern is a sign that your language is missing a feature. (Visitor Pattern is often singled out as the most glaring example of this.)
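To see why the Visitor pattern gets singled out, here is a hedged Java sketch of the boilerplate it takes just to dispatch on the concrete type of a value (Shape, Circle, and Square are hypothetical names):

interface ShapeVisitor<R> {
    R visit(Circle c);
    R visit(Square s);
}

interface Shape {
    <R> R accept(ShapeVisitor<R> v);
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public <R> R accept(ShapeVisitor<R> v) { return v.visit(this); }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public <R> R accept(ShapeVisitor<R> v) { return v.visit(this); }
}

In a language with pattern matching (ML, Haskell, or recent Java with sealed types), all of this collapses into a single match expression, which is exactly the sense in which the pattern marks a missing feature.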
I generally take the main difference between "patterns" and "idioms" to be one of size. An idiom is something small, like "use an interface for the type of a variable that holds a collection", while patterns tend to be larger. I think the smallness of idioms does mean that they're more often language-specific (the example I just gave was a Java idiom), but I don't think of that as their defining characteristic.

If you put five programmers in a room, they probably won't even agree on which things are patterns, so there's no real "right answer" to this.
One opinion that I heard once and really liked (though I can't for the life of me recall the source) is that idioms are things that should probably be in your language, or that some language already has. In other words, they are tricks we use because our language doesn't offer a direct primitive for them. For instance, there's no singleton construct in Java, but we can mimic one by hiding the constructor and offering a getInstance method.
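A minimal sketch of that Java idiom (the class name is hypothetical):

public final class Registry {
    private static final Registry INSTANCE = new Registry();

    private Registry() { } // hidden constructor: no outside instances

    public static Registry getInstance() {
        return INSTANCE;
    }
}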
Patterns, on the other hand, are more language-agnostic (though they often refer to a specific paradigm). You may have some infrastructure to support them (e.g., Spring for MVC), but they aren't, and aren't going to be, language constructs, and yet you could need them in any language from that paradigm.

Related

Meaning of Program into Your Language and Program in Your Language

I've been reading Code Complete 2. As I am not a native English speaker, some statements take me a while to understand. I would like you to describe the difference between these two statements the author makes in his book:
You should program into Your Language (programming language).
You shouldn't program in Your Language.
Why is in bad and into recommended?
As I understand it, it means to think outside of the bounds of your programming language.
So in means you are thinking in terms of the language; your thinking is limited by the language itself, and the program you write may not be easily translated into another language if needed.
But into means you think in algorithms, i.e. freely, and then translate them into your desired language, so you can easily code in any language whose syntax you know.
But as I haven't actually read the book, this may be totally wrong in context.
Program into your language means that you use the language to construct the "missing" pieces - leverage it to do more than it currently does. Things like creating missing data structures, algorithms, and ways of accomplishing tasks that are not native to the language.
Program in your language means just that - not trying to leverage it.
I thought the examples given in the book were quite good.
The author provides an example of his own in that part of the book (which unfortunately I don't remember). You can try reading a bit further.
It means that even if the language doesn't support a particularly convenient feature, you should still aim for readable, easy-to-maintain, modular code, so you should find a way to emulate that feature even though it isn't enforced by the language. Then you document that convention, so that other developers who modify the code stick to the same rule. One classic example of such an emulation is sketched below.
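Before Java 5 added enum, for instance, the missing feature was commonly mimicked with the "type-safe enum" idiom, built out of what the language did offer (a minimal sketch, with hypothetical constant names):

public final class Suit {
    public static final Suit CLUBS = new Suit("clubs");
    public static final Suit HEARTS = new Suit("hearts");
    public static final Suit SPADES = new Suit("spades");
    public static final Suit DIAMONDS = new Suit("diamonds");

    private final String name;

    private Suit(String name) { this.name = name; } // fixed set of instances

    @Override
    public String toString() { return name; }
}

The language never enforced this shape; the convention did, which is why documenting it for the next developer matters.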

Pros and cons of weak and strong typing

I'm making the transition from Java to PHP/Javascript and discovering all the practical aspects of using a weakly typed language.
As I'm in a position to fully compare the two I'd like to know the pros and cons of each approach. Also, are there any other forms of typing out there?
In a weakly and dynamically typed programming language (like PHP), a programmer's mistakes show up as incoherent behaviour (for instance, the program displays nonsensical information).
With a strongly but dynamically typed language (like Python), programming mistakes cause error messages at run time. That makes mistakes easier to uncover and diagnose, but in general the program is no longer usable after the message has been shown.
Finally, with a strongly and statically typed language (like Java, Ada, OCaml, Haskell, ...), some mistakes can be uncovered at compile time, which reduces the risk of shipping a buggy program (but the release happens later).
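A tiny Java illustration of that last point - the kind of mistake a static type checker rejects before the program ever runs:

public class TypeDemo {
    public static void main(String[] args) {
        int count = Integer.parseInt("42"); // fine: explicit conversion
        // int broken = "42";               // rejected at compile time:
        //                                  // incompatible types
        System.out.println(count + 1);      // prints 43
    }
}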
Yes. Python uses Dynamic Typing.
Generally it's a matter of personal preference and of the role the language's architects intended it to fill.
It makes sense for PHP (a scripting language), for example, to be weakly typed, as the tasks it generally performs are far less complex and require fewer constraints than, say, those of a compiled language.
Regarding your final question, Mathematica is said to be "typeless."
High-level, typeless, dynamic language with consistent symbolic syntax and semantics across all data, functions, and interfaces
PHP/JavaScript can be used to develop better-looking UIs than Java. PHP also has fewer constraints and is easier to learn and run than Java.

How to translate from one language to another?

I don't want an automated solution.
When you have to translate a program from one language to another, what do you do? Do you prefer to rewrite it from the beginning, or to copy and paste it and change only what needs to be changed?
What's the best choice?
It depends on
the goals (quick hack for one time use? long-lived production project for work?)
the resources I have (how many man-hours? a test suite and/or functional spec for the old code? familiarity with both languages?)
most importantly, the differences between the languages - both conceptual (OO? functional? reflection? control structures?) and in the available libraries.
Please note that this bullet is not as trivial as it seems - it depends in large part on how idiomatic the original program is. As an example, some people write very "C-like" Perl code (e.g. using C control flow and very C++-like OO design) which can be trivially copied to C or C++, and some people write incredibly intricate idiomatic Perl using functional programming, closures and reflective capabilities, which can't be obviously translated into C/C++.
Also, the quality of the original code matters. E.g. a good program will have its business logic separated out, usually expressed in standard configuration and control flow that's easier to clone directly.
E.g. translating from PHP to Perl for a hack job, you can often start out with copying, since many PHP constructs can be 1-to-1 mapped onto equivalent Perl constructs (just take your pick of Perl templating web library). The resulting code won't be GOOD Perl but will be Good Enough for some purposes.
On the other hand, translating, say, LISP code to Java, you're better off distilling the original code into a functional specification and rewriting from scratch. Your example of Python and JavaScript is probably in the same box.
Usually you have two languages that share at least some concepts (e.g. both have OO, and some imperative control structures) and thus you end up with some combination of the two approaches - parts of the code can be "thoughtlessly" translated, parts need to be re-written from scratch.
The more you lean on the second (complete rewrite) approach, the more idiomatic and powerful the code you end up with.
Normally, a rewrite using the original for inspiration and direction is the best choice.
The way you may do something in one language might very well not be the right way to do it in another.
When it comes to copy-paste, it is rare that you can just do that - languages are different and follow different syntax rules.
Of course, this all depends on the source and destination languages.
With your comment - javascript and python, I would say a rewrite is the best option.
That probably mostly depends on the similarity between the two languages and the meaning of "translation".
For instance, translating a bit of C89 code to C++ might not be so hard, considering it should compile out of the box when you copy-paste it (C++ is very nearly a superset of C89). I would hardly consider that a "translation", though.
On the other hand, translating Java to Haskell would certainly require a complete rewrite, as the language paradigms, even more than the syntax, are completely different.
Consider Wikipedia's list of programming languages. I'm too lazy to count how many of them are on this list, but let's assume there are 100.
If you want to translate one of them into another, that means that there are at least 100*99 = 9900 possible combinations for translation.
And that's an awful lot. Since most languages are unique, translation is very, very dependent on source and destination language.
Consider this Pascal to C converter. The author states it took him a year and a half to make a good translator for these particular languages. Obviously, this isn't a trivial task.
Depending on your ambitions, you might spend either one day, or many years translating a program from language A to language B.
How long this takes depends on your skill, size of your source code, complexity of languages A and B and their similarity.
As you can see, this isn't a trivial task and is highly dependent on your situation.

Syntactic sugar vs. feature

In C# (and Java) a string is little more than a char array with a stored length and a few methods tacked on. Likewise, (reference vs. value stuff aside) objects are little more than glorified structs with inheritance and interfaces added.
On one level, these additions feel like clear features and enhancements unto themselves. On another level, they feel like a marginal upgrade from the status of "syntactic sugar."
To take this idea further, consider (I may have some details wrong, but the point remains):
transistor
elementary logic gate
compound gate
    |         |
   ALU    flip-flop
    |      |      |
    |  register  RAM
    |      |
   CPU
microcode
assembly
C
C++
   |         |
 MSIL    JavaScript
  C#      jQuery
Many times, any single layer of abstraction looks a lot like syntactic sugar but multiple layers of separation feel very removed from each other.
How do you know when something has stopped being syntactic sugar and started being a bona fide feature?
It turns out to be a feature rather than syntactic sugar when it implies a different way of thinking.
You are right when you say objects are in fact glorified structs with methods and inheritance. That, however, is just an implementation detail. What objects allow is thinking in a different way: you can relate more easily to real-world entities when thinking in objects. The same thing happened even further back in time, when we jumped from gotos to procedural programming. Under the hood the processor is still jmp'ing from OP to OP, but we could think in a different, more black-box way.
Having said that, in the extreme you can say everything is syntactic sugar, but some of that sugar is a feature when it allows you to think differently.
Syntactic sugar is a feature.
All of software is a giant stack of abstractions built on top of other abstractions. A string may be nothing more than an array of characters, but there are many operations that feel natural on strings yet awkward on character arrays. The goal of all of these abstractions is the same: remove irrelevant details so that the developer can focus on the important parts of the problem.
As you point out, all modern programming languages could be eliminated, and we could go back to working in assembly language. But our productivity would plummet.
I guess people call something syntactic sugar when they feel they get little benefit from it, and a feature when they feel they get a large benefit from it. That makes the distinction very fuzzy, and quite subjective.
When the change provides value? I have coded in assembler. I switched to C and looked at the output from the compiler. Its code was 95+% as good as my hand-crafted assembler, and it was much easier to write. For me that provided value, so I'd say it wasn't sugar.
C++ helps me translate my object oriented thoughts into code. As long as the overhead isn't terribly high then I think it's a feature.
I'm a practical sort. "If I can see it's valuable" is my answer.
"Syntactic sugar" is a feature you don't like
It seems that syntactic sugar is syntax that changes nothing about the abilities of the language; using a different construct accomplishes exactly the same thing. A String (thinking in Java) is not just syntactic sugar over a char array. A char array is mutable (in content if not in length); you could not make a char array immutable with any existing language feature, which is what the String class provides.
On the other hand, the plus operator working on Strings is indeed syntactic sugar for using a StringBuilder and calling append.
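To make that concrete, here is a rough sketch of the desugaring (what the compiler actually emits varies between compiler versions):

String name = "world";

// What you write:
String a = "Hello, " + name + "!";

// Roughly what the compiler generates:
String b = new StringBuilder()
        .append("Hello, ")
        .append(name)
        .append("!")
        .toString();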
I would say it stops being sugar when the same result cannot be achieved simply by writing different code with roughly the same kind of "time constraint" as using the syntactic sugar.
My example would be a lambda expression: writing a foreach loop doesn't take a lot of effort, but using .ForEach() sure is nice too; versus rewriting the whole HttpRequest class on your own. One is syntactic sugar, the other is a feature. Both save time, one in a much bigger way than the other.
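For instance, a Java sketch of that first comparison (forEach being the Java analogue of the .ForEach() above); both forms do the same work:

import java.util.List;

public class ForEachDemo {
    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");

        // The plain loop: a little more ceremony, same behaviour.
        for (String item : items) {
            System.out.println(item);
        }

        // The lambda form: the same thing, expressed more compactly.
        items.forEach(item -> System.out.println(item));
    }
}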
Generally the term "syntactic sugar" refers to language features which never allowed a programmer to do something that could not be done before, but rather provided a nicer means of expressing something that could already be expressed in the language, even if somewhat more awkwardly.
Certain constructs may be unambiguously regarded as syntactic sugar. For example, in VB.NET, code to test for whether two references weren't equal used to require If Not (ref1 Is Ref2) but newer versions of the language allow If ref1 IsNot Ref2. Nothing can be expressed in the new syntax that couldn't be expressed in the old, but the new syntax is cleaner, introduces no ambiguities, and the only reason not to use it would be if code had to be back-compatible with old versions of the language.
Some constructs may be a bit harder to define as sugar. In particular, if a language adds constructs which will work identically to existing construct when used with other types, but will fail compilation with others, such constructs may provide a means of compile-time type verification which did not exist previously. Java generics may generally be viewed in this light. One can add a Cat to an ArrayList<Cat> just as easily as to an ArrayList; what the ArrayList<Cat> adds is a guard to reject Dogs at compile time. Since compile-time constraints don't allow one to write any program that couldn't be written without them, some people may view them as syntactic sugar. On the other hand, even though type verification is performed at compile-time rather than run-time, it might still be viewed as one of the jobs of a program.
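A short sketch of that example (Cat and Dog are the hypothetical classes from the paragraph above):

import java.util.ArrayList;
import java.util.List;

class Animal { }
class Cat extends Animal { }
class Dog extends Animal { }

public class GenericsDemo {
    public static void main(String[] args) {
        List<Cat> cats = new ArrayList<Cat>();
        cats.add(new Cat());        // fine
        // cats.add(new Dog());     // rejected at compile time

        List raw = new ArrayList(); // the pre-generics form
        raw.add(new Dog());         // compiles (with a warning); the mistake
                                    // only surfaces later, at run time
    }
}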
Syntactic sugar and language feature basically describe the same thing, even if "syntactic sugar" is sometimes used in a pejorative way, whereas "feature" is often associated with deeper changes in the language architecture (introducing lambdas, etc.).
But this distinction is very dependent on an individual point of view (and the subjectively felt usefulness).
Regarding language-design aspects and your example with strings and char arrays, I would say that this should be neither a feature nor sugar, but simply expressible in the language's basic syntax (LOP - language-oriented programming). Generic concepts (typeclasses, metaprogramming, etc.) allow you to express many new and useful constructs yourself without waiting for the language to get a new feature. Just look at Haskell or C++'s metaprogramming capabilities.

Runnable pseudocode?

I am attempting to determine prior art for the following idea:
1) user types in some code in a language called (insert_name_here);
2) user chooses a destination language from a list of well-known output candidates (javascript, ruby, perl, python);
3) the processor translates insert_name_here into runnable code in destination language;
4) the processor then runs the code using the relevant system call based on the chosen language.
The reason this works is that there is a pre-established 1-to-1 mapping between all language constructs in insert_name_here and all supported destination languages.
(Disclaimer: This obviously does not produce "elegant" code that is well-tailored to the destination language. It simply does a rudimentary translation that is runnable. The purpose is to allow developers to get a quick-and-dirty implementation of algorithms in several different languages for those cases where they do not feel like re-inventing the wheel, but are required for whatever reason to work with a specific language on a specific project.)
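A hypothetical sketch of the kind of construct table such a 1-to-1 mapping implies (all entries invented for illustration):

import java.util.Map;

public class MappingDemo {
    // One abstract construct, one fixed template per destination language.
    static final Map<String, Map<String, String>> CONSTRUCTS = Map.of(
        "print", Map.of(
            "python",     "print(%s)",
            "javascript", "console.log(%s);",
            "ruby",       "puts %s",
            "perl",       "print %s, \"\\n\";"));

    public static void main(String[] args) {
        // Emit the same abstract statement into two targets:
        System.out.println(String.format(CONSTRUCTS.get("print").get("python"), "x"));
        System.out.println(String.format(CONSTRUCTS.get("print").get("javascript"), "x"));
    }
}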
Does this already exist?
The .NET CLR is designed such that C++/CLI, C#, and VB.NET all compile to the same intermediate language (CIL), and you can "decompile" that CIL back into any one of those languages.
So yes, I would say it already exists, though not exactly as you describe.
There are converters available for different languages. The problem you are going to have is dealing with libraries. While mapping between language statements might be easy, finding mappings between library functions will be very difficult.
I'm not really sure how useful that type of code generator would be. Why would you want to write something in one language and then immediately convert it to something else? I can see the rationale for 4th Gen languages that convert diagrams or models into code but I don't really see the point of your effort.
Yes, a program that transforms a program from one representation to another does exist. It's called a "compiler".
And as to your question whether that is always possible: as long as your target language is at least as powerful as the source language, it is possible. So, if your target language is Turing-complete, then it is always possible, because there can be no language that is more powerful than a Turing-complete language.
However, there does not need to be a dumb 1:1 mapping.
For example: the Microsoft Volta compiler, which compiles CIL bytecode to JavaScript source code, has a problem: .NET has threads, JavaScript doesn't. But you can implement threads with continuations. Well, JavaScript doesn't have continuations either, but you can implement continuations with exceptions. So, Volta transforms the CIL to CPS and then implements CPS with exceptions. (Newer versions of JavaScript have semi-coroutines in the form of generators; those could also be used, but Volta is intended to work across a wide range of JavaScript versions, including obviously JScript in Internet Explorer.)
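The full CIL-to-CPS transformation is far beyond a snippet, but the last step of the trick - an exception standing in for the simplest kind of continuation, an escape continuation - can be sketched in Java (all names hypothetical):

import java.util.function.Consumer;
import java.util.function.Function;

// An exception used as a one-shot, non-resumable "jump out".
class Escape extends RuntimeException {
    final int value;
    Escape(int value) { this.value = value; }
}

public class EscapeDemo {
    // Runs body with an escape function; invoking it aborts body and
    // makes callWithEscape return the escaped value instead.
    static int callWithEscape(Function<Consumer<Integer>, Integer> body) {
        try {
            return body.apply(v -> { throw new Escape(v); });
        } catch (Escape e) {
            return e.value;
        }
    }

    public static void main(String[] args) {
        int result = callWithEscape(escape -> {
            escape.accept(42); // jump out immediately
            return -1;         // never reached
        });
        System.out.println(result); // prints 42
    }
}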
This seems a little bizarre. If you're using the term "prior art" in its most common form, you're discussing a potentially patentable idea. If that is the case, you have:
1/ Published the idea, starting the clock running on patent filing - I'm assuming, perhaps incorrectly, that you're based in the U.S. Other jurisdictions may have other rules.
2/ Told the entire planet your idea, which means it's pretty much useless to try and patent it, unless you act very fast.
If you're not thinking about patenting this and were just using the term "prior art" in a layperson's sense, I apologize. I work for a company that takes patents very seriously, and it's drilled into us, in great detail, what we're allowed to do with information before filing.
Having said that, patentable ideas must be novel, useful and non-obvious. I would think that your idea would not pass on the third of these since you're describing a language translator which would have the prior art of the many pascal-to-c and fortran-to-c converters out there.
The one glimmer of hope would be the ability of your idea to generate one of multiple output languages (which p2c and f2c don't do) but I think even that would be covered by the likes of cross compilers (such as gcc) which turn source into one of many different object languages.
IBM has a product called Visual Age Generator in which you code in one (proprietary) language and it's converted into COBOL/C/Java/others to run on different target platforms from PCs to the big honkin' System z mainframes, so there's your first problem (thinking about patenting an idea that IBM, the biggest patenter in the world, is already using).
Tons of them. p2c, f2c, and the original implementations of C++ and Objective-C strike me immediately. Beyond that, it's kind of hard to distinguish what you're describing from any compiler, especially for us old guys whose compilers generated ASM code as an intermediate representation anyway.