Related
Are "procedure" and "function" synonymous in Racket (a dialect of Scheme)? It seems to be implied by the documentation. For example, the documentation for compose describes it as a procedure that
[r]eturns a procedure that composes the given functions...The compose function
allows the given functions to consume and produce any number of
values...
(All of the above italicization was added by me.)
I understand that procedure? is a library procedure and function? is not. My question is whether it is correct to use the terms interchangeably when discussing programs (such as when teaching a class or writing documentation).
TL;Dr It's just lingo and means the same. function, procedure, and static method is the same in programming.
Historically a function is in the mathematical sense a mapping between arguments to a result. A procedure is a block of code that does something and its output does not need to be tied to any specific input. Thus you could say a function is a procedure with no side effects.
The Scheme standard uses only the term procedure. You won't find any hints about function at all. Racket is historically a standard Scheme implementation made for education purposes and is pretty much still compatible with Scheme for the most part today, but they have made a split and does not consider themselves to follow a Scheme standard. How to design programs and lots of documentation uses the term function and in this documentation it is meant as a synonym to procedure.
Common Lisp uses the term function consistently and its predecessors too, which predates Scheme.
I think I have even translated a SO answer between languages and changed the code as well as just switched function and procedure for consistency with the languages lingo itself. I would love to see Racket clean up some day and stay with one name to rule them all.
The short version: yes.
The longer version: a number of folks have done good work on aligning vocabulary for use in teaching. This is the first paper that comes to mind, although it does not specifically address the procedure/function choice:
https://cs.brown.edu/~sk/Publications/Papers/Published/mfk-measur-effect-error-msg-novice-sigcse/paper.pdf
From a pedagogic standpoint, of course, it's unhelpful to have two names for the same thing, sigh.
Finally, you'll get a more authoritative answer (and frankly, I'd like to know what the state of things here is) if you ask this question on the Racket Mailing List.
[EDIT] Ooh, further, I would not at all say that the word procedure is more likely to denote something defined in a library.
Hm, this is language - agnostic, I would prefer doing it in C# or F#, but I'm more interested this time in the question "how would that work anyway".
What I want to accomplish ist:
a) I want to LEARN it - it's about my ego this time, it's for a fun project where I want to show myself that I'm a really good at this stuff
b) I know a tiny little bit about EBNF (although I don't know yet, how operator precedence works in EBNF - Irony.NET does it right, I checked the examples, but this is a bit ominous to me)
c) My parser should be able to take this: 5 * (3 + (2 - 9 * (5 / 7)) + 9) for example and give me the right results
d) To be quite frankly, this seems to be the biggest problem in writing a compiler or even an interpreter for me. I would have no problem generating even 64 bit assembler code (I CAN write assembler manually), but the formula parser...
e) Another thought: even simple computers (like my old Sharp 1246S with only about 2kB of RAM) can do that... it can't be THAT hard, right? And even very, very old programming languages have formula evaluation... BASIC is from 1964 and they already could calculate the kind of formula I presented as an example
f) A few ideas, a few inspirations would be really enough - I just have no clue how to do operator precedence and the parentheses - I DO, however, know that it involves an AST and that many people use a stack
So, what do you think?
You should go learn about Recursive Descent parsers.
Check out a Code Golf exercise in doing just this, 10 different ways:
Code Golf: Mathematical expression evaluator (that respects PEMDAS)
Several of these "golf" solutions are recursive descent parsers just coded in different ways.
You'll find that doing just expression parsing is by far the easiest thing in a compiler. Parsing the rest of the language is harder, but understanding how the code elements interact and how to generate good code is far more difficult.
You may also be interested in how to express a parser using BNF, and how to do something with that BNF. Here's
an example of how to parse and manipulate algebra symbolically with an explicit BNF and an implicit AST as a foundation. This isn't what compilers traditionally do, but the machinery that does is founded deeply in compiler technology.
For a stack-based parser implemented in PHP that uses Djikstra's shunting yard algorithm to convert infix to postfix notation, and with support for functions with varying number of arguments, you can look at the source for the PHPExcel calculation engine
Traditionally formula processors on computers use POSTFIX notation. They use a stack, pop 2 items as operands, pop the third item as the operator, and push the result.
What you want is an INFIX to POSTFIX notation converter which is really quite simple. Once you're in postfix processing is the simplest thing you'll ever do.
If you want to go for an existing solution I can recommend a working, PSR-0 compatible implementation of the shunting yard algorithm: https://github.com/andig/php-shunting-yard/tree/dev.
I'm making a simple compiler for a simple pet language I'm creating and coming from a C background(though I'm writing it in Ruby) I wondered if a preprocessor is necessary.
What do you think? Is a "dumb" preprocessor still necessary in modern languages? Would C#'s conditional compilation capabilities be considered a "preprocessor"? Does every modern language that doesn't include a preprocessor have the utilities necessary to properly replace it? (for instance, the C++ preprocessor is now mostly obsolete(though still depended upon) because of templates.)
C's preprocessing can do really neat things, but if you look at the things it's used for you realize it's often for just adding another level of abstraction.
Preprocessing for different operations on different platforms? It's basically a layer of abstraction for platform independence.
Preprocessing for easily adding complex code? Abstraction because the language isn't generic enough.
Preprocessing for adding extensions into your code? Abstraction because your code / your language isn't flexible enough.
So my answer is: you don't need a preprocessor if your language is high-level enough *. I wouldn't call preprocessing evil or useless, I just say that the more abstract the language gets, the less reason I can think for it needing preprocessing.
* What's high-level enough? That is, of course, entirely subjective.
EDIT: Of course, I'm only really referring to macros. Using preprocessors for interfacing with other code files or for defining constants is evil.
The preprocessor is a cheap method to provide incomplete metaprogramming facilities to a language in an ugly fashion.
Prefer true metaprogramming or Lisp-style macros instead.
A preprocesssor is not necessary. For real metaprogramming, you should have something like MetaML or Template Haskell or hygienic macros à la Scheme. For quick and dirty stuff, if your users absolutely must have it, there's always m4.
However, a modern language should support the equivalent of C's #line directives. Such directives enable the compiler to locate errors in the original source, even when that source is embedded in a parser generator or a lexer generator or a literate program. In other words,
Design your language so as not to need a preprocessor.
Don't bundle your language with a blessed preprocessor.
But if others have their own reasons for using a preprocessor (parser generation is a popular one), provide support for accurate error messages.
I think that preprocessors are a crutch to keep a language with poor expressive power walking.
I have seen so much abuse of preprocessors that I hate them with a passion.
A preprocessor is a separated phase of compilation.
While preprocessing can be useful in some cases, the headaches and bugs it can cause make it a problem.
In C, preprocessor is used mostly for:
Including data - While powerful, the most common use-cases do not need such power, and "import"/"using" stuff(like in Java/C#) is much cleaner to use, and few people need the remaining cases;
Defining constants - Why not just provide a "const" statement
Macros - While C-style macros are very powerful(they can include statements such as returns), they also harm readability. Generics/Templates are cleaner and, while less powerful in a few ways, they are easier to understand.
Conditional compilation - This is possibly the most legitimate use-case for preprocessors, but once again it's painful for readability. Separating platform-specific code in platform-specific source code and using common if statements ends up being better for readability.
So my answer is while powerful, the preprocessor harms readability and/or isn't the best way to deal with some problems. Newer languages tend to consider code maintenance very important, and for those reasons the preprocessor appears to be obsolete.
It's your language so you can build whatever capabilities you want into the language itself, without a need for a preprocessor. I don't think a preprocessor should be necessary, and it adds a layer of complexity and obscurity on top of a language. Most modern languages don't have preprocessors, and in C++ you only use it when you have no other choice.
By the way, I believe D handles conditional compilation without a preprocessor.
It depends on exactly what other features you offer. For example, if I have a const int N, do you offer for me to take N variables? Have N member variables, take an argument to construct all of them? Create N functions? Perform N operations that don't necessarily work in loops (for example, pass N arguments)? N template arguments? Conditional compilation? Constants that aren't integral?
The C preprocessor is so absurdly powerful in the proper hands, you'd need to make a seriously powerful language not to warrant one.
I would say that although you should avoid the pre-processor for most everything you normally do, it's still necessary.
For example, in C++, in order to write a unit-testing library like Catch, a pre-processor is absolutely necessary. They use it in two different ways: One for assertion expansion1, and one for nesting sections in test cases2.
But, the pre-processor shouldn't be abused to do compile-time computations in C++ where const-expressions and template meta-programming can be used.
Sorry, I don't have enough reputation to post more than two links, so I'm putting this here:
github.com/philsquared/Catch/blob/master/docs/assertions.md
github.com/philsquared/Catch/blob/master/docs/test-cases-and-sections.md
A others have pointed out, much of the functionality provided by the C preprocessor exists to compensate for limitations of the C language. For example, #include and inclusion guards exist due to the lack of an import statement, and macros largely exist due to the lack of inline functions and constant declarations.
However, the one feature of the C preprocessor that would still be beneficial in more modern languages is the #line directive, since this supports the use of semantically-rich preprocessors/compilers. An an example, consider yacc, which is a domain-specific-language (DSL) for writing a parser as a collection of BNF grammar rules. A central feature of yacc is that chunks of C code called actions can be embedded within BNF rules. When a BNF rule is used to parse a piece of an input file, an action embedded in that rule will be executed. The yacc compiler generates a C file that implements the BNF-based parser specified in the input file, and any actions that appeared in the input Yacc file are copied to the generated C file, but each action is surrounded by #line directives. This use of #line directives provides two important benefits.
First, if there is a syntax error in an action, then the error message generated by the C compiler can specify that the error occurred in, say, <input-file-to-yacc>, line 42 rather than in <output-file-generated-by-yacc>.c, line 3967.
Second, the location information provided by #line directives is copied into generated object code files created by the C compiler. So if you are using a debugger to investigate a program crash, if the bug that caused the crash originated from an action embedded in a Yacc input file, then the debugger will report the location of that buggy line of source code as being in <input-file-to-yacc>, line 42 rather than in <output-file-generated-by-yacc>.c, line 3967.
The designers of C# and Perl wisely provided a #line directive. Unfortunately, the designers of many other languages (Java being one that springs to mind) neglected to provide a #line directive. Because of this, Yacc-like parser generators for many languages are unable to communicate the source location of embedded actions to compilers (and, therefore, to debuggers).
In the context of programming, how do idioms differ from patterns?
I use the terms interchangeably and normally follow the most popular way I've heard something called, or the way it was called most recently in the current conversation, e.g. "the copy-swap idiom" and "singleton pattern".
The best difference I can come up with is code which is meant to be copied almost literally is more often called pattern while code meant to be taken less literally is more often called idiom, but such isn't even always true. This doesn't seem to be more than a stylistic or buzzword difference. Does that match your perception of how the terms are used? Is there a semantic difference?
Idioms are language-specific.
Patterns are language-independent design principles, usually written in a "pattern language" (a uniform template) describing things such as the motivating circumstances, pros & cons, related patterns, etc.
When people observing program development from On High (Analysts, consultants, academics, methodology gurus, etc) see developers doing the same thing over and over again in various situations and environments, then the intelligence gained from that observation can be distilled into a Pattern. A pattern is a way of "doing things" with the software tools at hand that represent a common abstraction.
Some examples:
OO programming took global variables away from developers. For those cases where they really still need global variables but need a way to make their use look clean and object oriented, there's the Singleton Pattern.
Sometimes you need to create a new object having one of a variety of possible different types, depending on some circumstances. An ugly way might involve an ever-expanding case statement. The accepted "elegant" way to achieve this in an OO-clean way is via the "Factory" or "Factory Method" pattern.
Sometimes, a lot of developers do things in a certain way but it's a bad way that should be disrecommended. This can be formalized in an antipattern.
Patterns are a high-level way of doing things, and most are language independent. Whether you create your objects with new Object or Object.new is immaterial to the pattern.
Since patterns are something a bit theoretical and formal, there is usually a formal pattern (heh - word overload! let's say "template") for their description. Such a template may include:
Name
Effect achieved
Rationale
Restrictions and Limitations
How to do it
Idioms are something much lower-level, and usually operate at the language level. Example:
*dst++ = *src++
in C copies a data element from src to dst while incrementing the pointers to both; it's usually done in a loop. Obviously, you won't see this idiom in Java or Object Pascal.
while <INFILE> { print chomp; }
is (roughly quoted from memory) a Perl idiom for looping over an input file and printing out all lines in the file. There's a lot of implicit variable use in that statement. Again, you won't see this particular syntax anywhere but in Perl; but an old Perl hacker will take a quick look at the statement and immediately recognize what you're doing.
Contrary to the idea that patterns are language agnostic, both Paul Graham and Peter Norvig have suggested that the need to use a pattern is a sign that your language is missing a feature. (Visitor Pattern is often singled out as the most glaring example of this.)
I generally think the main difference between "patterns" and "idioms" to be one of size. An idiom is something small, like "use an interface for the type of a variable that holds a collection" while Patterns tend to be larger. I think the smallness of idioms does mean that they're more often language specific (the example I just gave was a Java idiom), but I don't think of that as their defining characteristic.
Since if you put 5 programmers in a room they will probably not even agree on what things are patterns, there's no real "right answer" to this.
One opinion that I've heard once and really liked (though can't for the life of me recall the source), is that idioms are things that should probably be in your language or there is some language that has them. Conversely, they are tricks that we use because our language doesn't offer a direct primitive for them. For instance, there's no singleton in Java, but we can mimic it by hiding the constructor and offering a getInstance method.
Patterns, on the other hand, are more language agnostic (though they often refer to a specific paradigm). You may have some infrastructure to support them (e.g., Spring for MVC), but they're not and not going to be language constructs, and yet you could need them in any language from that paradigm.
In C# (and Java) a string is little more than a char array with a stored length and a few methods tacked on. Likewise, (reference vs. value stuff aside) objects are little more than glorified structs with inheritance and interfaces added.
On one level, these additions feel like clear features and enhancements unto themselves. On another level, they feel like a marginal upgrade from the status of "syntactic sugar."
To take this idea further, consider (I may have some details wrong, but the point remains):
transistor
elementary logic gate
compound gate
| |
ALU flip-flop
| | |
| register RAM
| |
CPU
microcode
assembly
C
C++
| |
MSIL JavaScript
C# jQuery
Many times, any single layer of abstraction looks a lot like syntactic sugar but multiple layers of separation feel very removed from each other.
How do you know when something has stopped being syntactic sugar and started being a bona fide feature?
It turns out to be a feature instead of syntactic sugar when it implies a different way of thinking.
You are right when you say objects are in fact glorified structs with methods and inheritance. That, however, is just the implementation detail. What objects allow is to think in a different way. You can relate more easily to real world entities when thinking about objects. The same thing happened when even further back in time, we jumped from using go-to's to procedural programming. Under the hood, the processor still keeps on jmp'ing from OP to OP, but we could think in a different, more black-box, way.
Having said that, in extreme, you can say everything is syntactic sugar, but some of that sugar is a feature when it allows you to think differently.
Syntactic sugar is a feature.
All of software is a giant stack of abstractions built on top of other abstractions. A string may be nothing more than an array of characters, but there are many operations that feel natural on strings, but awkward on character arrays. The goal of all of these abstractions is the same: remove irrelevant details so that the developer can focus on the important parts of the problem.
As you point out, all modern programming languages could be eliminated, and we could go back to working in assembly language. But our productivity would plummet.
I guess people call something syntactic sugar when they feel they get little benefit from it, and a feature when they feel the get a large benefit from it. That makes the distinction very fuzzy, and quite subjective.
When the change provides value? I have coded in assembler. I switched to C and looked at the output from the compiler. It's code was 95+% as good as my hand crafted assembler and it was much easier to write. For me that provided value so I'd say it wasn't sugar.
C++ helps me translate my object oriented thoughts into code. As long as the overhead isn't terribly high then I think it's a feature.
I'm a practical sort. "If I can see it's valuable" is my answer
"Syntactic sugar" is a feature you don't like
It seems that syntactical surgar is a syntax that changes nothing about the abilities of the language, and using a different construct accomplishes exactly the same thing. A String (thinking in Java) is not just syntatical sugar over a char array. A char array is mutable (in content if not in length). You could not make a char array immutable with an existing language feature without a String array.
On the other hand, the plus operator working on Strings is indeed syntatical sugar for using a StringBuilder and calling append.
I would have to say when the same result is cannot be achieved simply by writing different code, with the same type of "time-constraint" as using the syntactical sugar.
My Example would be a Lambda expression, writing a foreach loop doesn't take a lot of effort, but using .Foreach() sure is nice too; versus rewriting the whole HttpRequest class on your own. One is syntactical, one is a feature. Both save time, one in a much bigger way than the other.
Generally the term "syntactic sugar" refers to language features which never allowed a programmer to do something which could not be done before, but rather provided a nice means of expressing something that could already expressed in the language, even if somewhat more awkwardly.
Certain constructs may be unambiguously regarded as syntactic sugar. For example, in VB.NET, code to test for whether two references weren't equal used to require If Not (ref1 Is Ref2) but newer versions of the language allow If ref1 IsNot Ref2. Nothing can be expressed in the new syntax that couldn't be expressed in the old, but the new syntax is cleaner, introduces no ambiguities, and the only reason not to use it would be if code had to be back-compatible with old versions of the language.
Some constructs may be a bit harder to define as sugar. In particular, if a language adds constructs which will work identically to existing construct when used with other types, but will fail compilation with others, such constructs may provide a means of compile-time type verification which did not exist previously. Java generics may generally be viewed in this light. One can add a Cat to an ArrayList<Cat> just as easily as to an ArrayList; what the ArrayList<Cat> adds is a guard to reject Dogs at compile time. Since compile-time constraints don't allow one to write any program that couldn't be written without them, some people may view them as syntactic sugar. On the other hand, even though type verification is performed at compile-time rather than run-time, it might still be viewed as one of the jobs of a program.
Syntactic sugar and language feature are basically describing the same thing, even if syntactic sugar is sometimes used in a pejorative way whereas feature is often associated with deeper changes in the language architecture (introducing lambdas etc.).
But this distinction is very dependent on a individual point of view (and its subjectively felt usefulness).
Regarding language-design aspects and your example with strings and char-arrays, I would say that this should be neither a feature nor sugar, but simply expressible in the languages basic syntax (LOP - language-oriented programming). Generic concepts (typeclasses, metaprogramming etc.) allow you to express many new and useful constructs by yourself without waiting for the language to get a new feature. Just look at Haskell or C++'s metaprogramming capabilities.