Terminology for the opposite of commented out - terminology

When a piece of code is commented we say just that, it's "commented out".
But when it's not commented out, what is that?
Uncommented isn't quite the same. Active?
It's definitely not commented in.
What's the best way to refer to the act of de-commenting out code?

Uncommented is the most common word for that.

I call code which isn't "commented out" simply "code"

Visual Studio has a function called "Comment out the selected lines".
The opposite function is called "Uncomment the selected lines". I use the term "uncommented."

A couple possibilities:
Live code
Legacy code

I wish it were "uncommented in."
I use "revived" or "restored" for code that used to be commented out, but no longer is. I use "live" or "uncommented" for code that's intended to be compiled or executed.

If it was once commented out in the past, then by uncommenting it it will become
Decommented.
(You can recomment it if you want but I won't recommand that..)
If it was never ever commented out, it's just
code.
These are, to me now, the several different (*) states a Live code can have. (Or maybe it's just yoda english.)
*: implicit potential meta- [add your latin-born term here :P].

It's just a piece of un-out-commented, un-disabled, un-erased, un-inactivated, un-unused, un-deprecated, un-obsolete, un-rewritten, un-archived code.
Or, in layman's terms, code.

Think of an editorial document. Its kind of like saying "what is the un-deleted writing called?" Just, the writing.

Although it is widely used to refer to 'live' code, 'Uncommented code' is an ambiguous term.
It could refer to code that was once commented out, but has been edited to allow it to be run, or it could refer to code that has no descriptive comments.

I call commented out code...poor code. If it is to be commented out then just toss it all together. Your version control system should keep track of the various states of your previous work. Rather than commenting it out for others to figure out why it was commented out later doesn't seem to be a good coding practice!

Seeing how many disparate answers exist for this question, there are a few things you should do.
Pick a term and stick with it.
Consistency is most important. It's okay if it's not the best one ever, and you can change your choice if a more obvious term appears in the distant future.
Explicitly describe that term to your teammates.
Don't just say, "it means uncommented, or whatever you call it." Don't just say, "it's the opposite of commented." Tell them that it means code that was previously commented out, and has now had its commenting syntax removed. Tell them that this code is now active and will execute when called. Never assume that your team is smart enough to "just get it" just because they nod when you use the term.
As a subjective answer, I use the term uncommented. It's a bad name for the behavior, but at least it's mildly intuitive. It beats nonsense like unhashed for languages that use the # character for comments.

I honestly do say "commented in". In phrases like "well, it works with that line commented out, so let's comment it back in and re-run the test". "Uncomment" would be more correct, but sounds clunkier. I wouldn't use that expression in formal writing, though, just when talking to my pair.

During a debugging session, I often comment and uncomment lines of code.
From a purely semantic (and yes, anally pedantic) perspective, it's the action that's described by the phrase, not the code.
"Commented Out" is a verb describing the action whereby a statement was turned into a comment to remove it from the code that's acted upon by the language's parser.
Code having been subjected to the opposite action, that of taking a comment and turning it into a statement to include it in the code that the language's parser acts upon, would be "Statemented In". But of course that's ridiculous.
That said, I agree with Andrew that outside of a transient debugging session, before it's committed, code should be removed if it's not used and not left around in comments to confuse things.

The problem is with the term "commented out". It's an abuse of the comment feature of most languages. In C/C++ you should use the preprocessor to conditionally compile your code, perhaps like the following:
#if 0
...
... here is the code that is not in the build
..
#endif
...
... here is the code that is in the build
...
I recall a coding standard in a place I used to work at used "#ifdef NOT_DEF" and a few other symbols in place of "#if 0" to add some semantics to the "commented out" block.
The terms used under this scheme were "included" and "excluded" code, though "included" was usually assumed when talking generally about "the code".
Of course, not all modern languages have such a preprocessing feature, so you're back to abusing the comment feature.

I ended up going with commented code being code under comment and normal code being code not under comment. The action being take code from under comment or bring out from under comment.

We call it 'unremd'. You won't find it in the dictionary but it is the opposite of REM(d) and most importantly has fewer syllables...

Related

Creating function wrappers

I know there are a lot of questions/answers about function wrappers.
Let me add mine.
I have the situation that I have a function, let's call it draw_this($arg1, $arg2) that is being called in, let's say 38 different locations.
As it turned out later, draw_this() only works properly if we add $arg3 as well. So draw_this() now looks like: draw_this($arg1, $arg2, $arg3).
However $arg3 has to be ALWAYS provided. To make the long story a bit shorter, a possible, yet in my opinion rather bad solution would be to create a wrapper function,
"Approach 1":
function draw_this_wrapper($arg1, $arg2){
$arg3 = include($arg3);
draw_this($arg1, $arg2, $arg3);
}
or, even worse, "Approach 2":
draw_this($arg1, $arg2){
$arg3 = include($arg3);
...
}
Why I really dislike those solutions is that they (at least in my opinion) destroy the principle of encapsulation which has too many advantages to get rid of (or am I wrong here as the information has to be provided every single time?). Approach 1 at least 'kinda' keeps the encapsulation. However it removes the ability to see what draw_this() really looks like (as it is called with three instead of just the two arguments.
However, I have the impression I might miss a design principle here, something that I do not know yet on how to properly solve this situation in a good/"clean" way. Can you give me an insight what would be a good solution here? (The main problem is that the function has to be changed at all 38 locations etc.).
Thanks in advance!!
p.s. Sorry, I forgot to add that $arg3 is ALWAYS the same. So it does not change depending on the context. However, everything is done in a procedural environment. Solution? :/
Ok. As I am working in a procedural environment I actually decided to go for "Approach 2". If any major problems occur I might report them here.

Regular Expressions vs XPath when parsing HTML text

I want to parse a HTML text and find special parts. For example a text in 3rd div of 1st row and 2nd column of a table. I have 2 options to parse: Regular Expressions and XPath. What is advantages and disadvantages of each one?
thanks
It somewhat depends on whether you have a complete HTML file of unknown but well-formed content versus having merely a snippet or an expanse of HTML of completely known content which may or may not be well-formed.
There is a difference between editing and parsing, you see.
It is one thing to be editing your own HTML file that you wrote yourself or are otherwise staring right in the face, and you issue the editor command
:100,200s!<br */>!!g
To remove the breaks from lines 200–300.
It is quite another to suck down whatever HTML happens to be at the other end of a URL and then try to make some sense out it, sight unseen.
The first calls for a regex solution — the very one shown above, in fact. To go off writing some massively overengineered behemoth to do a fall parse to set up the entire parse tree just to do the simple edit shown above is quite simply wrong. It’s also its own punishment.
On the other hand, using patterns to parse out (as opposed to lex out) an entire HTML document that can contain all kinds of whacky things you aren’t planning for just cries out for leveraging someone else’s hard work intead of recreating the wheel for yourself, and badly at that.
However, there’s something else nobody likes to mention, and that’s that most people just aren’t competent at regexes. They don’t really understand them. They don’t know how to test them or to craft them. They don’t know how to make them readable and maintainable.
The truth of the matter is that the overwhelming majority of regex users cannot even manage as simple and basic a thing as matching an arbitrary HTML tag using a regex, even when things gotchas like alternate encodings and CDATA sections and redefined entitities and <script> contents and archaic never-seen forms are all safely dispensed with.
It’s not because it’s hard to do; it isn’t, actually. It’s just that the people trying to do it understand neither regexes nor HTML particularly well, and they don’t know they don’t know, and so they get themselves in way over their heads more quickly than they realize. And then they have a complete disaster on their hands.
Plus it’s been done before, and correctly. Might as well learn from someone else’s mistakes for a change, eh? It would probably help to have a few canned regexes at your disposal to go at frequently manipulated things. This is especially useful for editing.
But for a full parse, you really shouldn’t try to embed a full HTML grammar inside your pattern. Honest, you really shouldn’t. Speaking as someone has actually can and has done this, I unlike 99.9999% of the responders here the credibility of actual experience in this area when I advise against it. Sure, I can do it, but I almost never want to, and I certainly don’t want you to try it at home unsupervised. I can’t be held responsible for any damage that might ensue. :)
Sure, this may sound like “Do as I say, not as I do,” but if your level of regex mastery were at a level that allowed you to contemplate such a thing, you would not be asking this question. As I mentioned, almost no one who uses regexes can actually match an arbitrary HTML tag, simple as that is. Given that you need that sort of building block before writing your recursive descent grammar, and given that next to nobody can even manage that simple building block, well...
Given that sad state of affairs, it’s probably best to use regexes for simple edit jobs only, and leave their use for more complete solutions to real regex wizards, for they are subtle and quick to anger. Meaning of course the regexes, not (just) the wizards.
But sure, keep some canned regexes handy for doing simple editing rather than full parsing. That way you won’t be forced to redevise them each time from first principles. I do keep a few of these around, but then I also keep simple frameworks that allow me to edit a particular structural element of the HTML, like the plain text or the tag contents or the link references, etc, and those all use a full parser, letting me then surgically target just the parts I want in complete confidence I haven’t forgotten something.
More as a testament to what is possible than what is advisable, you can see some answers with more, um, “heroic” pattern matching, including recursion,
here,
here,
here,
here,
here, and
here.
Understand that some of those were actually written for the express purpose of showing people why they should not use regexes, because some of them are really quite sophisticated, much moreso than you can expect in nonwizards. That difficulty may chase you away, which is ok, because it was sort of meant to.
But don’t let that stop you from using vi on your HTML files, nor should it scare you away from using its search or substitute commands. Don’t let the perfect be the enemy of the good. Sometimes good enough is exactly what you need, because the perfect would take more investment than it could ever be worth.
Understanding which out of several possible approaches will give you the most bang for your buck is something that takes time to learn, and no one can tell you the answer that works for you. They don’t know your dataset, your requirements, your skillset, your priorities. Therefore any categorical answer is automatically wrong. You have to evaluate these things for yourself.
I think XPath is the primary option for traversing XML-like documents. With RegExp, it will be up to you to handle the different forms of writing a tag (with multiple spaces, double quotes, single quotes, no quotes, in one line, in multi-lines, with inner data, without inner data, etc). With XPath, this is all transparent to you, and it has many features (like accessing a node by index, selecting by attribute values, selecting simblings, and MANY others).
See how powerfull it can be at http://www.w3schools.com/xpath/.
EDIT: See also How do HTML parses work if they're not using regexp?
XPath is less likely to break if the web developer does any minor changes. That would be my choice.
Here is the canonical Stackoverflow explanation for why you should not parse HTML with regex:
RegEx match open tags except XHTML self-contained tags
In general, you cannot parse HTML with regex because regex is not made to parse HTML. Just use XPath.

What is the golden rule for when to split code up into functions?

It's good to split code up into functions and classes for modularity / decoupling, but if you do it too much, you get really fragmented code which is also not good.
What is the golden rule for when to split code up into functions?
It really does depend on the size and scope of your project.
I tend to split things into functions every time there is something repeated, and that repetition generalized/abstracted, following the golden rule of DRY (Do not repeat yourself).
As far as splitting up classes, I tend to follow the Object Oriented Programming mantra of isolating what is the same from what is different, and split up classes if one class is implementing more an one large theoretical "idea" or entity.
But honestly if your code is too fragmented and opaque, you should try considering refactoring into a different approach/paradigm.
If I can give it a good name (better than the code it replaces), it becomes a function
I think you generally have to think of a chunk of codes have a chance of being reuse. It comes with experience to figure that out and plan ahead.
I agree with the answers that concentrate on reusability and eliminating repetition, but I also believe that readability is at least as important. So in addition to repetition, I would look for pieces of code (classes, functions, blocks, etc.) that do more than one thing.
If the name associated with the code does not describe what it does, then that is a good time to refactor that code into units which each have a single responsibility and a descriptive name. This separation of concerns will help both reusability, and readability.
Useful code can stick around for a long time, so it is important that you (or ideally someone else) can go back and easily understand code that was written months or years before.
Probably my own personal rule is if it is more than 2 lines long, and is referenced more than once on the same page (ASP.net), or a few times spread over a couple of pages, than I will write a function to do it.
I was taught that anything you do more than once should be a function. Anything similar should have a parent class, and above all else consult your source code "standards" within your organization. The latter mostly deals with formatting.
First, write the feature you're adding. (Notice the word "First", we tend to write a function/class before writing the feature, which might lead to having too many fragmentation).
Then, review the code you just wrote/changed, find what blocks of the code that is:
3-lines or more, and..
repeated, and..
Can be grouped under a named function. Because code is written by us, people (not programs), to be read/changed later by us, people (not programs).

Rails - Escaping HTML using the h() AND excluding specific tags

I was wondering, and was as of yet, unable to find any answers online, how to accomplish the following.
Let's say I have a string that contains the following:
my_string = "Hello, I am a string."
(in the preview window I see that this is actually formatting in BOLD and ITALIC instead of showing the "strong" and "i" tags)
Now, I would like to make this secure, using the html_escape() (or h()) method/function.
So I'd like to prevent users from inserting any javascript and/or stylesheets, however, I do still want to have the word "Hello" shown in bold, and the word "string" shown in italic.
As far as I can see, the h() method does not take any additional arguments, other than the piece of text itself.
Is there a way to escape only certain html tags, instead of all? Like either White or Black listing tags?
Example of what this might look like, of what I'm trying to say would be:
h(my_string, :except => [:strong, :i]) # => so basically, escape everything, but leave "strong" and "i" tags alone, do not escape these.
Is there any method or way I could accomplish this?
Thanks in advance!
Excluding specific tags is actually pretty hard problem. Especially the script tag can be inserted in very many different ways - detecting them all is very tricky.
If at all possible, don't implement this yourself.
Use the white list plugin or a modified version of it . It's superp!
You can have a look Sanitize as well(Seems better, never tried it though).
Have you considered using RedCloth or BlueCloth instead of actually allowing HTML? These methods provide quite a bit of formatting options and manage parsing for you.
Edit 1: I found this message when browsing around for how to remove HTML using RedCloth, might be of some use. Also, this page shows you how version 2.0.5 allows you to remove HTML. Can't seem to find any newer information, but a forum post found a vulnerability. Hopefully it has been fixed since that was from 2006, but I can't seem to find a RedCloth manual or documentation...
I would second Sanitize for removing HTML tags. It works really well. It removes everything by default and you can specify a whitelist for tags you want to allow.
Preventing XSS attacks is serious business, follow hrnt's and consider that there is probably an order of magnitude more exploits than that possible due to obscure browser quirks. Although html_escape will lock things down pretty tightly, I think it's a mistake to use anything homegrown for this type of thing. You simply need more eyeballs and peer review for any kind of robustness guarantee.
I'm the in the process of evaluating sanitize vs XssTerminate at the moment. I prefer the xss_terminate approach for it's robustness—scrubbing at the model level will be quite reliable in a regular Rails app where all user input goes through ActiveRecord, but Nokogiri and specifically Loofah seem to be a little more peformant, more actively maintained, and definitely more flexible and Ruby-ish.
Update I've just implemented a fork of ActsAsTextiled called ActsAsSanitiled that uses Santize (which has recently been updated to use nokogiri by the way) to guarantee safety and well-formedness of the RedCloth output, all without needing any helpers in your templates.

Ideal user feedback for HTML input

Let's face it: writing proper, standards compliant HTML is quite difficult to do. Writing semantic HTML is even more so, but I don't think it's possible for a computer to figure that out.
So my question to you is what would the "ideal" feedback for a user who entered HTML be? Would it be a W3C validator style list of errors and corresponding line numbers and columns? Would it be a annotated code display of highlighted lines, explanations of the errors, and possible fixes? A spell-check style mode where you handle each error separately? Would it be not giving them any error information at all? Also, what types of errors are a good idea to tell users? (Some broad classes of errors include parsing errors, nesting errors (i.e. putting a div in a b tag) and well-formedness errors.)
Scottm: Good point; I've never liked the W3C way of listing all the errors either. However, there is still the question of then letting the user edit the offending HTML appropriately.
onebyone: Ok, so looking at some screenshots it looks like HTML Validator has a W3C error list, but combined with the ability to go straight to the relevant source segment and expanded error information, as well as the fact that you don't have to go scrolly to jump from one section to another. Looks pretty good, but is it usable by the average Joe?
Edit 1: As a clarification, this is with regards to the interface, not necessarily the underlying implementation. However, interface needs to be feasible with plain HTML and JavaScript (double usability points if it just needs HTML, but I think you're going to get stuck with W3C in that case).
The output from the Firefox "HTML validator" add-on is pretty good. It shows you the source in a big window, and a list of errors in a small window (smallness doesn't matter, since you generally only care about the first one, since you're aiming for a total of none). Click an error to highlight, and an expanded explanation is shown in a second small window, while the offending part of the code is highlighted in the big window.
The add-on doesn't include a text editor, though, so it's not a full solution to your problem. It uses both an SGML-based validator and HTML Tidy, though, and I think for local files you can get it to make the corrections suggested by Tidy.
I always think syntax highlighting is great. In HTML this would be very useful too, as tags can be easily distinguished by the developer when he/she can see them appropraitely coloured.
Personally I don't like the W3C way of giving you a big boring list of problems. Visual aids in the code itself are much better.