Generically, in the software world, what's the term one would use to describe the following functionality?
css: #import
php: include
ruby: require
etc.
Thanks.
There isn't really a single generic term for the functionality described in the question; the terminology depends on the context. Just as your question suggests, if you are talking about CSS you would probably use the term "import". If you are talking about C or PHP you would probably use the word "include". If you were talking about Groovy or Java you would probably use the word "import".
I was wondering whether some programming languages are faster than others when it comes to processing and parsing an HTML page.
My intention is to scan thousands of HTML forum pages, processing the markup to look for specific <div> tags and their content.
If there are no real differences, what language would you recommend for such a task?
Well, it depends.
You should definitely take a look at options like Node.js, Python, or PHP and figure out what works best for you.
I personally would lean toward Node.js, because its non-blocking I/O lets you fetch many pages concurrently.
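To make the task concrete, here is a minimal sketch in Python, assuming the third-party requests and beautifulsoup4 packages are installed; the URL and the post-content class are placeholders:

    # Minimal sketch: scan forum pages for specific <div> tags.
    # Assumes the third-party packages `requests` and `beautifulsoup4`;
    # the URL and CSS class below are placeholders.
    import requests
    from bs4 import BeautifulSoup

    def extract_divs(url):
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        # find_all returns every matching tag; adjust class_ to your target
        return [div.get_text(strip=True)
                for div in soup.find_all("div", class_="post-content")]

    for text in extract_divs("https://example.com/forum/page1"):
        print(text)

For thousands of pages the bottleneck is usually network I/O rather than the language's parsing speed, which is why a non-blocking or concurrent fetching strategy matters more than the choice of language.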
I heard from someone that DSLs are really powerful in some specific fields, so I want to find out whether I can add them to my skill set.
The first problem is: what exactly is a DSL? After doing some searching, it seems Groovy supports DSLs very well. So I went and read Groovy's documentation and tried it out myself.
I got the impression that a DSL is just some kind of configuration file consisting of text or XML; you use a tool like Groovy to parse it, and it magically becomes methods or functions you can invoke. What is actually happening?
I have read a few things but still can't get it straight. Any help?
Did you read this? Martin Fowler is an authority on the subject and a great writer. I doubt that anyone will improve on the first paragraph. If you still don't get it, give it some time and re-read the article a few times.
I'd recommend looking into JetBrains' MPS.
A book might be overwhelming, but there's a relatively new one available.
I got the impression that a DSL is just some kind of configuration file consisting of text or XML; you use a tool like Groovy to parse it, and it magically becomes methods or functions you can invoke. What is actually happening?
I don't think your impression is entirely accurate. I'd forget about Groovy, parsing, and all the implementation details for now. Focus on the problem a DSL is trying to solve.
A DSL designer tries to come up with a pseudo programming language that a domain expert, who is unfamiliar with programming languages like Groovy or Java or C#, would recognize as a simple language describing the way they solve problems.
The DSL uses terms and concepts familiar to anyone knowledgeable about that domain.
The DSL shields users from the underlying implementation details so they can focus on how to attack their problems.
A DSL is written for the convenience of business users, not developers.
Keep that in mind and the rest is implementation. Eye on the prize....
A domain specific language (DSL) is a programming language that is deliberately not fully featured. The point is that programming in a DSL can be easier, and less prone to bugs, than programming in a general purpose language. The "domain" in "domain specific language" refers to the specific purpose the language will be used for.
For example, the language a calculator uses, with just + - * / and numbers, could be called a domain specific language. It has the advantage over a regular programming language that programs in it will never segfault, crash, loop forever, etc. Other examples of domains might be web development; Ur/Web, for instance, is a DSL for building web applications, and SQL is a domain specific language for databases.
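To make that concrete, here is a minimal sketch in Python (the host language is an arbitrary choice for illustration) of a calculator-style DSL evaluator: only numbers and + - * / are legal, so a "program" in this little language can never crash the host or loop forever.

    # Sketch of a tiny calculator "DSL": only numbers and + - * / are
    # accepted, so a program in this language cannot loop or touch the OS.
    import ast
    import operator

    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def calc(source):
        return _eval(ast.parse(source, mode="eval").body)

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise SyntaxError("not part of the calculator language")

    print(calc("2 + 3 * 4"))   # 14

Anything outside the domain, say a function call or a loop, is rejected up front; that restriction is exactly what makes the language "domain specific".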
I don't know much about Groovy, but it seems that there are particular tools for using it to create DSLs. Fundamentally, to create a DSL you need to specify a syntax, along with some sort of semantics. How exactly Groovy does this I do not know.
A DSL is a language dedicated to a specific domain. For instance, the well-known CSS is a domain specific language for describing the look and formatting of a document.
Using Groovy you might create your own DSL focused on any chosen domain, e.g. accounting, telecommunications, or banking. This means that the language will use the common terminology of that area and meet the needs of that domain, so it will be easily understood by people of the domain who are not necessarily technical (e.g. accountants). Sometimes it is aimed entirely at non-programmers. Groovy in particular is a dynamic language with which you can let end users add scripts dynamically, through configuration files, much as Excel does with VBA.
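As an illustration of that "internal DSL" idea, here is a minimal sketch in Python rather than Groovy; all the names are invented, and the point is only that fluent, chainable calls can read close to the domain's own terminology.

    # Sketch of an internal DSL: ordinary code, written so an accountant
    # could read the rule. All names here are invented for illustration.
    class Invoice:
        def __init__(self, amount):
            self.amount = amount
            self.discount = 0.0

        def apply_discount(self, percent):
            self.discount = self.amount * percent / 100
            return self          # returning self lets the calls chain fluently

        def total(self):
            return self.amount - self.discount

    # Reads close to the domain: "invoice of 500, apply a 10% discount".
    print(Invoice(500).apply_discount(10).total())   # 450.0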
In any case, if you are interested in this subject you should delve into Martin Fowler's publications.
I've been reading Code Complete 2. As I am not a native English speaker, some statements take me a while to understand. I would like you to describe the difference between these two statements the author makes in his book:
You should program into Your Language (programming language).
You shouldn't program in Your Language.
Why is in bad and into recommended?
As I understand it, it means to think outside of the bounds of your programming language.
So in means you are thinking in terms of the language; your thinking is limited by the language itself, and the program you write may not be easy to translate into some other language if needed.
But into means you think in algorithms, i.e. freely, and then translate them into your desired language. That way you can code in any language whose syntax you know.
But as I have not actually read the book, this may be totally wrong in context.
Program into your language means that you use the language to construct the "missing" pieces - leveraging it to do more than it does out of the box: creating missing data structures, algorithms, and ways of accomplishing tasks that are not native to the language.
Program in your language means just that - not trying to leverage it.
I thought the examples given in the book were quite good.
The author provides an example of his own in that part of the book (which unfortunately I don't remember). You can try reading a bit further.
It means that even if the language doesn't support a particularly convenient feature, you should try to find a way to emulate it, since you should always aim for readable, easy-to-maintain, modular code even when the language doesn't enforce it. You then document that convention so that other developers who modify the code stick to the same rule.
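A minimal sketch of the idea in Python: the language has no built-in design-by-contract, so you program into it by establishing a small, documented precondition helper and using it consistently (the helper and the names here are hypothetical).

    # Python lacks built-in design-by-contract; "program into" the language
    # by adopting a documented helper the whole team uses consistently.
    def require(condition, message):
        """Precondition check; by convention every public function starts
        with one or more require() calls."""
        if not condition:
            raise ValueError(f"precondition failed: {message}")

    def withdraw(balance, amount):
        require(amount > 0, "amount must be positive")
        require(amount <= balance, "amount must not exceed balance")
        return balance - amount

    print(withdraw(100, 30))   # 70

The feature isn't enforced by the language, but the convention gives you most of its value, which is exactly the "into" mindset.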
How does one study the code of open-source libraries, particularly standard libraries?
The code base is often vast and hard to navigate. How do you find a particular function or class definition?
Do I search through downloaded source files?
Do I need cvs/svn for that?
Maybe web-search?
Should I just know the structure of the standard library?
Is there any reference on it?
Or do some IDEs have such features? Or some other tools?
How to do it effectively without one?
What are the best practices for doing this in open-source libraries generally?
Is there any convention for how sources are handled on Linux/Unix systems?
What are the differences for specific programming languages?
Broad presentation of the subject is highly encouraged.
I mark this 'community wiki' so everyone can rephrase and expand my awkward formulations!
Update: I probably didn't express the problem clearly enough. What I want is to view just the source code of some specific library class or function. The problem is mostly about work organization and usability: how do I navigate the huge pile of sources to find that one thing? Maybe there are specific tools or approaches? It feels like solutions for this must have existed for a long time.
One thing to note is that standard libraries are sometimes (often?) optimized more than is good for most production code.
Because they are widely used, they have to perform well over a wide variety of conditions, and may be full of clever tricks and special logic for corner cases.
Maybe they are not the best thing to study as a beginner.
Just a thought.
Well, I think it's insane to just sit down and read a library's code. My approach is to search whenever I come across the need to implement something myself, and then study the way it's implemented in those libraries.
There are also a lot of projects/libraries with excellent documentation, which I find more important to read than the code. On Unix-based systems you often find valuable information in the man pages.
Wow, that's a big question.
The short answer: it depends.
The long answer:
Some libraries provide documentation while others don't. Standard libraries are usually pretty well documented, whether or not your chosen implementation of the library includes that documentation. For instance, you may have found an implementation of the C standard library without documentation, but the C standard has been around long enough that there are hundreds of good reference books available. Documentation with hyperlinks is a very useful way to learn a new API. In any case, the first place I would look is the library's main website.
For less well known libraries lacking documentation I find two different approaches very helpful.
First is a doc generator. Nearly every language I know of has one. It basically parses a source tree and creates documentation (usually as HTML or XML) which can be used to learn a library. Some use specially formatted comments in the code to create more complete documentation. JavaDoc is one good example of this, and doc generators for many other languages borrow from it.
Second is an IDE with a class browser. These act as a sort of on-the-fly documentation. Some display just the library's interface; others include description comments from the library's source.
Both of these require access to the library's source (which will come in handy anyway if you actually intend to use the library).
Many of these tools and techniques work equally well for closed/proprietary libraries.
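As one concrete illustration (in Python, since its standard library ships with the tooling), the inspect module can take you straight from a name to its source, which answers the "view just the source of one specific function" part of the question.

    # Jump straight to a specific function's source instead of browsing
    # the whole tree: the standard `inspect` module locates and prints it.
    import inspect
    import json

    print(inspect.getsourcefile(json.dumps))   # path to the defining file
    print(inspect.getsource(json.dumps))       # the function's source text

Most IDEs expose the same capability as "Go to Definition" for whatever language they support.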
The standard Java libraries' source code is available. For a beginning Java programmer it can be a great read. The Collections framework in particular is a good place to start: take, for instance, the implementation of ArrayList and learn how you can implement a resizable array in Java. Most of the source even has useful comments.
The best parts to read are probably those whose purpose you can understand immediately. Start with the easy pieces and try to follow all the steps that are hidden behind that single call you make from your own code.
Something I do from time to time:
apt-get source foo
Then create a new C++ project (or whatever) in Eclipse and import the sources.
=> Wow! Browsable! (Use F3 to jump to declarations.)
Does anyone out there know of examples of, and the theory behind, parsers that will take (maybe) an abstract syntax tree and produce code, instead of vice versa? Mathematically, at least intuitively, I believe the function code->AST is invertible, but I'm trying to find work/examples of this... besides the usual resources like the Dragon book and such. Any ideas?
Such a thing is called a Visitor. It traverses the tree and does whatever has to be done at each node, for example optimizing or generating code.
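A minimal sketch in Python of a code-generating visitor over a toy expression AST (the node classes are invented for illustration):

    # A code-generating visitor over a toy expression AST.
    class Num:
        def __init__(self, value): self.value = value

    class Add:
        def __init__(self, left, right): self.left, self.right = left, right

    class CodeGen:
        def visit(self, node):
            # dispatch on the node's class name, the classic visitor idiom
            return getattr(self, f"visit_{type(node).__name__}")(node)

        def visit_Num(self, node):
            return str(node.value)

        def visit_Add(self, node):
            return f"({self.visit(node.left)} + {self.visit(node.right)})"

    tree = Add(Num(1), Add(Num(2), Num(3)))
    print(CodeGen().visit(tree))   # (1 + (2 + 3))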
Our DMS Software Reengineering Toolkit insists on parsers and parser-inverses (called "prettyprinters") as the "poker ante" for mechanical processing (analyzing/transforming) of arbitrary languages. These provide a full round trip: source text to ASTs with captured position information (file/line/column) and comments, and AST back to legal source text, either regenerating the original token positions ("fidelity printing") or nicely formatted ("prettyprinting"), including regeneration of the comments.
Parsers are often specified by a combination of grammars and lexical definitions of tokens; these notations are typically compiled into efficient parsing engines, and DMS does that for the "parser" side, as you might expect. Other folks here suggest that a "visitor" is the way to do prettyprinting, and, like assembly code, it is the right way to implement prettyprinting at the lowest level of abstraction. However, DMS prettyprinters are specified in terms of a text-box construction language over grammar terms, somewhat like LaTeX, that lets one control the placement of the various language elements horizontally, vertically, embedded, spaced, concatenated, laminated, etc. DMS compiles these into efficient low-level visitors (as other answers suggest) that implement the box generation. But, as with the parser generator, you don't have to see all the ugly detail.
DMS has some 30+ sets of these language front ends for various programming languages and formal notations, ranging from C++, C, Java, C#, and COBOL to HTML, XML, assembly languages for some machines, temporal property specifications, specs for composable abstract algebras, etc.
I rather like lewap's response:
find a mathematical way to express a visitor and you have a dual to the parser
But you asked for a sample, so try this on for size: Visual Studio contains a UML editor with excellent symmetry. Both it and the code editors are implemented as views of an underlying model, and editing either one modifies the model, so all the views stay in sync.
Actually, generating code from a parse tree is strictly easier than parsing code, at least in a mathematical sense.
There are many grammars which are ambiguous, meaning there is no unique way to parse a given string, but a parse tree can always be converted to a string in a unique way, modulo whitespace.
The Dragon book gives a good description of the theory of parsers.
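For a concrete round trip, Python's standard library has both directions built in: ast.parse produces the tree, and ast.unparse (Python 3.9+) regenerates source. Note that the regenerated text is only semantically equivalent to the input; the original spacing and comments are gone.

    # Round trip: source -> AST -> source (Python 3.9+).
    # The output is semantically equivalent, not character-identical.
    import ast

    original = "x=1+  2   # a comment"
    tree = ast.parse(original)
    print(ast.unparse(tree))   # x = 1 + 2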
There is theory, along with working implementations and examples, of reversible parsing in Haskell. The library is by Paweł Nowak. Please refer to
https://hackage.haskell.org/package/syntax
as your starting point. You can find examples at the following URLs.
https://hackage.haskell.org/package/syntax-example
https://hackage.haskell.org/package/syntax-example-json
I don't know where to find much about the theory, but boost::spirit 2.0 has both qi (parser) and karma (generator), sharing the same underlying structure and grammar, so it's a practical implementation of the concept.
Documentation on the generator side is still pretty thin (Spirit 2 was new in Boost 1.38 and is still in beta), but there are a few bits of Karma sample code around, and AFAIK the library is in a working state.
In addition to 'Visitor', 'unparser' is another good keyword to web-search for.
That sounds a lot like the back end of a non-optimizing compiler whose target language is the same as its source language.
One question would be whether you require the "unparsed" code to be identical to the original, or just functionally equivalent.
For example, would it be OK for the output to use a different indentation style than the original? That information wouldn't normally be stored in the AST because it's not semantically important.
One thing to look at would be automatic code refactoring tools.
I've been doing these forever, and calling them "DeParse".
It only gets tricky if you also want to recapture whitespace and comments. You have to tuck them into the parse tree so you can regenerate them on output.
The "Visitor Pattern" idea is good. But, I should consider "Visitor" pattern as a lineal list pattern, or, as a generic pattern, and add patterns for more specific cases like Lists, Matrices, and Trees.
Look for a "Hierarchical Visitor Pattern" or "Tree Visitor Pattern" on the web.
You have a tree data structure ("Collection") and want to do something with the data, each time you "visit", "iterate" or "read" an item from the tree.
In your case, you have a tree data structure that represents the result of scanning/parsing some source code. You then read each item's data and transform it into code in the destination language.
There are several "lens languages" that allow bidirectional transformation of source code.
It is also possible to implement reversible parsers using definite clause grammars in Prolog. In SWI-Prolog, the phrase/3 predicate converts parse trees into text and vice-versa. This book provides some additional examples of reversible parsing in Prolog.