Meaning of Program into Your Language and Program in Your Language - language-agnostic

I've been reading Code Complete 2. As I am not native english speaker some statements take some time for me to understand. I would like you to describe the difference between these two statements the author made in his book:
You should program into Your Language (programming language).
You shouldn't program in Your Language.
Why in is bad and into is recommended?

As I understand it, it means to think outside of the bounds of your programming language.
So in means you are thinking in terms of the language, so your thinking is limited by the language itself, and the program you write may not be easily translated into some other language if needed.
But into means you think in algorithms, i.e. freely, then translate into your desired language. So you can easily code in any language you know the syntax of.
But as I have not read the book actually, this may be totally wrong per the context.

Program into your language means that you use the language to construct the "missing" pieces - leverage it to do more than it currently does. Things like creating missing data structure, algorithms and ways of accomplishing tasks that are not native to the language.
Program in your language means just that - not trying to leverage it.
I thought the examples given in the book were quite good.

The author provides an example of his own in that part of the book (which unfortunately I don't remember). You can try reading a bit further.
It means that even if the language doesn't support a particularly convenient feature, as you should always think of writing readable, easy to maintain, modular code, you should try to find a way to emulate that feature even if its not enforced by the language, then you would document that, so that other developers who may modify the code stick to the same rule. I can't provide an example right now, but I think is easy to see the rationale.

Related

Pros and cons of weak and strong typing

I'm making the transition from Java to PHP/Javascript and discovering all the practical aspects of using a weakly typed language.
As I'm in a position to fully compare the two I'd like to know the pros and cons of each approach. Also, are there any other forms of typing out there?
A weakly dynamically typed programming language (like PHP) made that the programmer's mistakes occur as non-coherent behaviours (for instance, the program gonna display stupid informations).
With a strongly dynamically typed language (like python), the programming mistakes causes error message. It makes the mistakes easier to uncover and diagnosis but in general the program became not usable after the message has been shown.
Finally, with a strongly statically typed language (like Java, Ada, OCaml, Haskell, ...) some mistakes can be uncovered at compile time and hence reduce the risk to provide an bugged program. (but the release occurs later)
Yes. Python uses Dynamic Typing.
Generally it's a matter of personal preference and the role that the architects of a given language's intended use.
PHP (a scripting language) for example makes sense to be weakly typed, as the tasks it generally performs are far less complex, and require less constraints then say a compiled language.
Regarding your final question, Mathematica is said to be "typeless."
High-level, typeless, dynamic language with consistent symbolic syntax and semantics across all data, functions, and interfaces
PHP/javascript can be used to develope better looking UI's than Java. PHP will be having less constraints and easy to learn and execute than java.

How to translate from a language to another?

I don't want an automated solution.
When you have to translate a program from a language to another, what do you do? You prefer to rewrite it from the beginning or copy and paste it and change only what need to be changed?
What's the best choice?
It depends on
the goals (quick hack for one time use? long-lived production project for work?)
the resources I have (how many man-hours? Test suit and/or functional spec for old code? familiarity with both languages?)
most importantly, differences between the languages. Both the conceptual (OO? functional? reflection? control structures?) as well as available libraries.
Please note that this bullet is not as trivial as it seems - this depends in large part on how idiomatic the original program is - as an example, some people write very "C-like" Perl code (e.g. using C control flow and very C++ like OO design) which can be trivially copied to C or C++, and some people write incredibly intricate idiomatic Perl using functional programming, closures and reflective capabilties; which can't be obviously translated into C/C++.
Also, quality of the original code. E.g. good program will have separated business logic, usually expressed in standard configuration and control flow that's easier to directly clone.
E.g. translating from PHP to Perl for a hack job, you can often start out with copying, since many PHP constructs can be 1-to-1 mapped onto equivalent Perl constructs (just take your pick of Perl templating web library). The resulting code won't be GOOD Perl but will be Good Enough for some purposes.
On the other hand, translating, say, LISP code to Java, you're better off just translating the original code into a functionality specification and re-write from scratch. Your example of Python and JavaScript is probably in the same box.
Usually you have two languages that share at least some concepts (e.g. both have OO, and some imperative control structures) and thus you end up with some combination of the two approaches - parts of the code can be "thoughtlessly" translated, parts need to be re-written from scratch.
The more of the second (complete rewrite) approach, the better quality idiomatic and powerful code you end up with.
Normally, a rewrite using the original for inspiration and direction is the best choice.
The way you may do something in one language might very well not be the right way to do it in another.
When it comes to copy-paste, it is rare that you can just do that - languages are different and follow different syntax rules.
Of course, this all depends on the source and destination languages.
With your comment - javascript and python, I would say a rewrite is the best option.
That probably mostly depends on the similarity between the two languages and the meaning of "translation".
For instance, translating a bit of C89 code to C++ might not be so hard considering it should compile out of the box when you copy-paste (C++ is a compatible super-set to C89). I would hardly consider that a "translation", though.
On the other hand, translating Java to Haskell would certainly require a complete rewrite as the language paradygms, even worse than the syntax, are completely different.
Consider Wikipedia's list of programming languages. I'm too lazy to count how many of them are on this list, but let's assume there are 100.
If you want to translate one of them into another, that means that there are at least 100*99 = 9900 possible combinations for translation.
And that's an awful lot. Since most languages are unique, translation is very, very dependent on source and destination language.
Consider this Pascal to C converter. The author states it took him one and a half year to make a good translator for these particular languages. Obviously, this isn't a trivial task.
Depending on your ambitions, you might spend either one day, or many years translating a program from language A to language B.
How long this takes depends on your skill, size of your source code, complexity of languages A and B and their similarity.
As you can see, this isn't a trivial task and is highly dependent on your situation.

Idiom vs. pattern

In the context of programming, how do idioms differ from patterns?
I use the terms interchangeably and normally follow the most popular way I've heard something called, or the way it was called most recently in the current conversation, e.g. "the copy-swap idiom" and "singleton pattern".
The best difference I can come up with is code which is meant to be copied almost literally is more often called pattern while code meant to be taken less literally is more often called idiom, but such isn't even always true. This doesn't seem to be more than a stylistic or buzzword difference. Does that match your perception of how the terms are used? Is there a semantic difference?
Idioms are language-specific.
Patterns are language-independent design principles, usually written in a "pattern language" (a uniform template) describing things such as the motivating circumstances, pros & cons, related patterns, etc.
When people observing program development from On High (Analysts, consultants, academics, methodology gurus, etc) see developers doing the same thing over and over again in various situations and environments, then the intelligence gained from that observation can be distilled into a Pattern. A pattern is a way of "doing things" with the software tools at hand that represent a common abstraction.
Some examples:
OO programming took global variables away from developers. For those cases where they really still need global variables but need a way to make their use look clean and object oriented, there's the Singleton Pattern.
Sometimes you need to create a new object having one of a variety of possible different types, depending on some circumstances. An ugly way might involve an ever-expanding case statement. The accepted "elegant" way to achieve this in an OO-clean way is via the "Factory" or "Factory Method" pattern.
Sometimes, a lot of developers do things in a certain way but it's a bad way that should be disrecommended. This can be formalized in an antipattern.
Patterns are a high-level way of doing things, and most are language independent. Whether you create your objects with new Object or Object.new is immaterial to the pattern.
Since patterns are something a bit theoretical and formal, there is usually a formal pattern (heh - word overload! let's say "template") for their description. Such a template may include:
Name
Effect achieved
Rationale
Restrictions and Limitations
How to do it
Idioms are something much lower-level, and usually operate at the language level. Example:
*dst++ = *src++
in C copies a data element from src to dst while incrementing the pointers to both; it's usually done in a loop. Obviously, you won't see this idiom in Java or Object Pascal.
while <INFILE> { print chomp; }
is (roughly quoted from memory) a Perl idiom for looping over an input file and printing out all lines in the file. There's a lot of implicit variable use in that statement. Again, you won't see this particular syntax anywhere but in Perl; but an old Perl hacker will take a quick look at the statement and immediately recognize what you're doing.
Contrary to the idea that patterns are language agnostic, both Paul Graham and Peter Norvig have suggested that the need to use a pattern is a sign that your language is missing a feature. (Visitor Pattern is often singled out as the most glaring example of this.)
I generally think the main difference between "patterns" and "idioms" to be one of size. An idiom is something small, like "use an interface for the type of a variable that holds a collection" while Patterns tend to be larger. I think the smallness of idioms does mean that they're more often language specific (the example I just gave was a Java idiom), but I don't think of that as their defining characteristic.
Since if you put 5 programmers in a room they will probably not even agree on what things are patterns, there's no real "right answer" to this.
One opinion that I've heard once and really liked (though can't for the life of me recall the source), is that idioms are things that should probably be in your language or there is some language that has them. Conversely, they are tricks that we use because our language doesn't offer a direct primitive for them. For instance, there's no singleton in Java, but we can mimic it by hiding the constructor and offering a getInstance method.
Patterns, on the other hand, are more language agnostic (though they often refer to a specific paradigm). You may have some infrastructure to support them (e.g., Spring for MVC), but they're not and not going to be language constructs, and yet you could need them in any language from that paradigm.

runnable pseudocode?

I am attempting to determine prior art for the following idea:
1) user types in some code in a language called (insert_name_here);
2) user chooses a destination language from a list of well-known output candidates (javascript, ruby, perl, python);
3) the processor translates insert_name_here into runnable code in destination language;
4) the processor then runs the code using the relevant system call based on the chosen language
The reason this works is because there is a pre-established 1 to 1 mapping between all language constructs from insert_name_here to all supported destination languages.
(Disclaimer: This obviously does not produce "elegant" code that is well-tailored to the destination language. It simply does a rudimentary translation that is runnable. The purpose is to allow developers to get a quick-and-dirty implementation of algorithms in several different languages for those cases where they do not feel like re-inventing the wheel, but are required for whatever reason to work with a specific language on a specific project.)
Does this already exist?
The .NET CLR is designed such that C++.Net, C#.Net, and VB.Net all compile to the same machine language, and you can "decompile" that CLI back in to any one of those languages.
So yes, I would say it already exists though not exactly as you describe.
There are converters available for different languages. The problem you are going to have is dealing with libraries. While mapping between language statements might be easy, finding mappings between library functions will be very difficult.
I'm not really sure how useful that type of code generator would be. Why would you want to write something in one language and then immediately convert it to something else? I can see the rationale for 4th Gen languages that convert diagrams or models into code but I don't really see the point of your effort.
Yes, a program that transform a program from one representation to another does exist. It's called a "compiler".
And as to your question whether that is always possible: as long as your target language is at least as powerful as the source language, then it is possible. So, if your target language is Turing-complete, then it is always possible, because there can be no language that is more powerful than a Turing-complete language.
However, there does not need to be a dumb 1:1 mapping.
For example: the Microsoft Volta compiler which compiles CIL bytecode to JavaScript sourcecode has a problem: .NET has threads, JavaScript doesn't. But you can implement threads with continuations. Well, JavaScript doesn't have continuations either, but you can implement continuations with exceptions. So, Volta transforms the CIL to CPS and then implements CPS with exceptions. (Newer versions of JavaScript have semi-coroutines in the form of generators; those could also be used, but Volta is intended to work across a wide range of JavaScript versions, including obviously JScript in Internet Explorer.)
This seems a little bizarre. If you're using the term "prior art" in its most common form, you're discussing a potentially patentable idea. If that is the case, you have:
1/ Published the idea, starting the clock running on patent filing - I'm assuming, perhaps incorrectly, that you're based in the U.S. Other jurisdictions may have other rules.
2/ Told the entire planet your idea, which means it's pretty much useless to try and patent it, unless you act very fast.
If you're not thinking about patenting this and were just using the term "prior art" in a laypersons sense, I apologize. I work for a company that takes patents very seriously and it's drilled into us, in great detail, what we're allowed to do with information before filing.
Having said that, patentable ideas must be novel, useful and non-obvious. I would think that your idea would not pass on the third of these since you're describing a language translator which would have the prior art of the many pascal-to-c and fortran-to-c converters out there.
The one glimmer of hope would be the ability of your idea to generate one of multiple output languages (which p2c and f2c don't do) but I think even that would be covered by the likes of cross compilers (such as gcc) which turn source into one of many different object languages.
IBM has a product called Visual Age Generator in which you code in one (proprietary) language and it's converted into COBOL/C/Java/others to run on different target platforms from PCs to the big honkin' System z mainframes, so there's your first problem (thinking about patenting an idea that IBM, the biggest patenter in the world, is already using).
Tons of them. p2c, f2c, and the original implementation s of C++ and Objective C strike me immediately. Beyond that, it's kind of hard to distinguish what you're describing from any compiler, especially for us old guys whose compilers generated ASM code for an intermediate represetation anyway.

Theory, examples of reversible parsers?

Does anyone out there know about examples and the theory behind parsers that will take (maybe) an abstract syntax tree and produce code, instead of vice-versa. Mathematically, at least intuitively, I believe the function of code->AST is reversible, but I'm trying to find work/examples of this... besides the usual resources like the Dragon book and such. Any ideas?
Such thing is called a Visitor. Is traverses the tree and does whatever has to be done, for example optimize or generate code.
Our DMS Software Reengineering Toolkit insists on parsers and parser-inverses (called "prettyprinters") as "poker-ante" to mechanical processing (analyzing/transforming) of arbitrary languages. These provide full round-trip: source text to ASTs with captured position information (file/line/column) and comments, and AST to legal source text including regenerating the original token positions ("fidelity printing") or nicely formatted ("prettyprinting") options, including regeneration of the comments.
Parsers are often specified by a combination of grammars and lexical definitions of tokens; these notations are typically compiled into efficient parsing engines, and DMS does that for the "parser" side, as you might expect. Other folks here suggest that a "visitor" is the way to do prettyprinting, and, like assembly code, it is the right way to implement prettyprinting at the lowest level of abstraction. However, DMS prettyprinters are specified in terms of a text-box construction language over grammar terms something like Latex, that enables one to control the placement of the various language elements horizontally, vertically, embedded, spaced, concatenated, laminated, etc. DMS compiles these into efficient low-level visitors (as other answers suggest) that implement the box generation. But like the parser generator, you don't have see all the ugly detail.
DMS has some 30+ sets of these language front ends for a various programming langauge and formal notations, ranging from C++, C, Java, C#, COBOL, etc. to HTML, XML, assembly languages from some machines, temporaral property specifications, specs for composable abstract algebras, etc.
I rather like lewap's response:
find a mathematical way to express a
visitor and you have a dual to the
parser
But you asked for a sample, so try this on for size: Visual Studio contains a UML editor with excellent symmetry. The way both it and the editors are implemented, all constitute views of the model, and editing either modifies the model resulting in all remaining in synch.
Actually, generating code from a parse tree is strictly easier than parsing code, at least in a mathematical sense.
There are many grammars which are ambiguous, that is, there is no unique way to parse them, but a parse tree can always be converted to a string in a unique way, modulo whitespace.
The Dragon book gives a good description of the theory of parsers.
There are theory, working implementations and examples of reversible parsing in Haskell. The library is by Paweł Nowak. Please refer to
https://hackage.haskell.org/package/syntax
as your starting point. You can find the examples at following URLs.
https://hackage.haskell.org/package/syntax-example
https://hackage.haskell.org/package/syntax-example-json
I don't know where to find much about the theory, but boost::spirit 2.0 has both qi (parser) and karma (generator), sharing the same underlying structure and grammar, so it's a practical implementation of the concept.
Documentation on the generator side is still pretty thin (spirit2 was new in Boost 1.38, and is still in beta), but there are a few bits of karma sample code around, and AFAIK the library's in a working state and there are at least some examples available.
In addition to 'Visitor', 'unparser' is another good keyword to web-search for.
That sounds a lot like the back end of a non-optimizing compiler that has it's target language the same as it's source language.
One question would be whether you require the "unparsed" code to be identical to the original, or just functionally equivalent.
For example, would it be OK for the output to use a different indentation style than the original? That information wouldn't normally be stored in the AST because it's not semantically important.
One thing to look at would be automatic code refactoring tools.
I've been doing these forever, and calling them "DeParse".
It only gets tricky if you also want to recapture whitespace and comments. You have to tuck them into the parse tree so you can regenerate them on output.
The "Visitor Pattern" idea is good. But, I should consider "Visitor" pattern as a lineal list pattern, or, as a generic pattern, and add patterns for more specific cases like Lists, Matrices, and Trees.
Look for a "Hierarchical Visitor Pattern" or "Tree Visitor Pattern" on the web.
You have a tree data structure ("Collection") and want to do something with the data, each time you "visit", "iterate" or "read" an item from the tree.
In your case, you have a tree data structure, that represents the result of scanning/parsing some source code. Then you have read each item's data, and transform it into destination code.
There are several "lens languages" that allow bidirection transformation of source code.
It is also possible to implement reversible parsers using definite clause grammars in Prolog. In SWI-Prolog, the phrase/3 predicate converts parse trees into text and vice-versa. This book provides some additional examples of reversible parsing in Prolog.