How short should a function be?

I did some searching on here and haven't found anything quite like this, so I'm going to go ahead and ask. This is really more about semantics than an actual programming question. I'm currently writing something in C++ but the language doesn't really matter.
I'm well aware that it's good programming practice to keep your functions/methods as short as possible. Yet how do you really know if a function is too long? Alternatively, is it ever possible to break functions down too much?
The first programming language I learned (other than Applesoft BASIC, which doesn't count...) was 6502 assembly language, where speed and optimization are everything. In cases where a few extra cycles screw up the timing of your entire program, it's often better to set a memory location or register directly rather than jump to another subroutine. The former operation might take 3 or 4 cycles, while, altogether, the latter might take two or three times that.
While I realize that nowadays if I were to even mention cycle counts to some programmers they'd just give me a blank look, it's a hard habit to break.
Specifically, let's say (again using C++) we have a private class method that's something like the following:
void Foo::do_stuff(int x) {
    this->x = x;
    // various other operations on x
    this->y = this->x;
}
I've seen some arguments that, at the very least, each set of operations should be its own function. For instance, do_stuff() should in theory be renamed to set_x(int x), a separate function should be written for the set of operations performed on class member x, and a third function should be written to assign the final value of class member x to class member y. But I've seen other arguments that EVERY operation should have its own function.
To me, this just seems wrong. Again, I'm looking at things from an internal perspective; every method call is pushing an address on the stack, performing its operations, then returning from the subroutine. This just seems like a lot of overhead for something relatively simple.
Is there a best practice for this sort of thing or is it more up to individual judgment?

Since the days of 6502 assembly, two things have happened: Computers have got much faster, and compilers (where appropriate) have become smarter.
Now the advice is to stop spending all your time fretting about the individual cycles, until you are sure that it is a problem. You can spend that time more wisely. If you mention cycle counts to me, I won't look at you blankly because I don't know what they are. I will look at you wondering if you are wasting your effort.
Instead, start thinking about making your functions small enough to be:
understandable,
testable,
re-usable, maybe, where that is appropriate.
If, later, you find some of your code isn't running fast enough, consider how to optimise it.
Note: The optimisation might be to hint to the compiler to move the function inline, so you still get the advantages above, without the performance hit.
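For illustration, here is a minimal sketch of that idea (the Counter class and its members are just an invented example): the function stays tiny for readability, and because it is defined in the class body it is implicitly inline, so a decent compiler will usually replace the call with the body itself.

// Hypothetical example: a tiny accessor kept separate for readability.
// Defining it in the class body (or marking it inline) lets the compiler
// substitute the body at the call site, so there is no call overhead.
class Counter {
public:
    int  value() const { return count_; }  // implicitly inline
    void increment()   { ++count_; }       // implicitly inline
private:
    int count_ = 0;
};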

The most important thing about deciding where to break up a function is not necessarily how much the function does. It is rather about determining the API of your class.
Suppose we break Foo::do_stuff into Foo::set_x, Foo::twiddle_x, and Foo::set_y. Does it ever make sense to do these operations separately? Will something bad happen if I twiddle x without first setting it? Can I call set_y without calling set_x? By breaking these up into separate methods, even private methods within the same class, you are implying that they are at least potentially separate operations.
If that's not the case, then by all means keep them in one function.
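To make the trade-off concrete, here is a hypothetical sketch of both shapes. The set_x/twiddle_x/set_y names come from the discussion above; the doubling step is invented just to stand in for "various other operations on x".

// Option A: keep it as one operation with one entry point.
class FooA {
public:
    void do_stuff(int new_x) {
        x = new_x;
        x *= 2;      // stand-in for "various other operations on x"
        y = x;
    }
private:
    int x = 0;
    int y = 0;
};

// Option B: split into steps. Worth it only if each step is a meaningful
// operation on its own; otherwise it just implies an ordering contract
// (set_x before twiddle_x before set_y) that callers have to remember.
class FooB {
public:
    void set_x(int new_x) { x = new_x; }
    void twiddle_x()      { x *= 2; }
    void set_y()          { y = x; }
private:
    int x = 0;
    int y = 0;
};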

"I'm well aware that it's good programming practice to keep your functions/methods as short as possible"
I wouldn't use the above criteria to refactor my functions into smaller ones. Below is what I use:
1. Keep all the functions at the same level of abstraction.
2. Make sure there are no side effects (for exceptional cases, make sure to explicitly document them).
3. Make sure a function is not doing more than one thing (the Single Responsibility Principle). But you can break this to honor 1.
Other good practices for Method Design:
Don't Make the Client Do Anything the Module Could Do
Don't Violate the Principle of Least Astonishment
Fail Fast: Report Errors as Soon as Possible After They Occur
Overload With Care
Use Appropriate Parameter and Return Types
Use Consistent Parameter Ordering Across Methods
Avoid Long Parameter Lists
Avoid Return Values that Demand Exceptional Processing
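As one small illustration of the "avoid long parameter lists" item above, a common technique is to bundle related parameters into a struct. The names here (ReportOptions, print_report) are invented for the sketch, not from any particular library.

#include <iostream>
#include <string>

struct ReportOptions {
    std::string title;
    int  page_size      = 50;
    bool include_footer = true;
};

// Before: a long list that is easy to call with the arguments in the wrong order.
// void print_report(const std::string& title, int page_size, bool include_footer);

// After: one options object, so call sites name what they set.
void print_report(const ReportOptions& options) {
    std::cout << options.title << " (" << options.page_size << " rows per page)\n";
}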

After reading Clean Code: A Handbook of Agile Software Craftsmanship, which touched on almost every piece of advice on this page, I started writing shorter functions. Not for the sake of having short functions, but to improve readability and testability, keep them at the same level of abstraction, have them do one thing only, etc.
What I've found most rewarding is that I find myself writing a lot less documentation because it's just not necessary for 80% of my functions. When the function does only what its name says, there's no point in restating the obvious. Writing tests also becomes easier because each test method can set up less and perform fewer assertions. When I revisit code I've written over the past six months with these goals in mind, I can more quickly make the change I want and move on.

I think in any code base, readability is a much greater concern than a few more clock cycles, for all the well-known reasons, foremost maintainability, which is where most code spends its time, and as a result where most money is spent on the code. So I'm kind of dodging your question by saying that people don't concern themselves with it because it's negligible in comparison to other [more corporate] concerns.
Although if we put other concerns aside, there are explicit inline statements, and more importantly, compiler optimisations that can often remove most of the overhead involved with trivial function calls. The compiler is an incredibly smart optimising machine, and will often organise function calls much more intelligently than we could.

Related

Do we need to avoid duplication at any cost?

In our application framework, whenever code duplication starts to appear (even to the slightest degree), people immediately want it removed and extracted into a separate function. Though that is good when it is done at an early stage of development, soon that function needs to be changed in unexpected ways. Then they add an if condition and other conditions to make that function stay.
Is duplication really evil? I feel we can wait for triplication (duplication in three places) to occur before turning it into a function. But I am seen as an alien if I suggest this. Do we need to avoid duplication at any cost?
There is a quote from Donald Knuth that pinpoints your question:
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.
I think that duplication is needed so we are able to understand what are the critical points that need to be DRYed.
Duplicated code is something that should be avoided but there are some (less frequent) contexts where abstractions may lead you to poor code, for example when writing test specs some degree of repetition may keep your code easier to maintain.
I guess you should always ask what the gain of another abstraction is; if there is no quick win, you may leave some duplicated code and add a TODO comment. This will let you code faster, deliver earlier, and DRY only what is most valuable.
A similar approach is described by Three Strikes And You Refactor:
The first time you do something, you just do it.
The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway.
(Or if you're like me, you don't wince just yet -- because your DuplicationRefactoringThreshold is 3 or 4.)
The third time you do something similar, you refactor.
This is the holy war between management, developers and QC.
So to end this war you need to establish a common set of concepts for all three of them:
Abstraction vs Encapsulation, Interface vs Class, Singleton vs Static.
You all need to give yourselves time to understand each other's point of view to make this work.
For example, if you are using Scrum, then you can spend some sprints on development and then dedicate a sprint to abstraction and encapsulation.

When is it time to refactor code?

On one hand:
1. You never get time to do it.
2. "Context switching" is mentally expensive (difficult to leave what you're doing in the middle of it).
3. It usually isn't an easy task.
4. There's always the fear you'll break something that's now working.
On the other:
1. Using that code is error-prone.
2. Over time you might realize that if you had refactored the code the first time you saw it, that would have saved you time in the long run.
So my question is - Practically - When do you decide it's time to refactor your code?
Thanks.
A couple of observations:
On one hand:
1. You never get time to do it.
If you treat re-factoring as something separate from coding (instead of an intrinsic part of coding decently), and if you can't manage time, then yeah, you'll never have time for it.
"Context switching" is mentally expensive (difficult to leave what you're doing in the middle of it).
See previous point above. Refactoring is an active component of good coding practices. If you separate the two as if they were two different tasks, then 1) your coding practices need improvement/maturing, and 2) you will engage in severe context switching if your code is in a severe need of refactoring (again, code quality.)
It usually isn't an easy task.
Only if the code you produce is not amenable to refactoring. That is, code that is hard to refactor exhibits one or more of the following (list is not universally inclusive):
High cyclomatic complexity,
No single responsibility per class (or procedure),
High coupling and/or low cohesion (aka poor LCOM metrics),
Poor structure,
Not following the SOLID principles.
No adherence to the Law of Demeter when appropriate.
Excessive adherence to the Law of Demeter when inappropriate.
Programming against implementations instead of interfaces.
There's always the fear you'll break something that's now working.
Testing? Verification? Analysis? Any of these before being checked into source control (and certainly before being delivered to the users)?
On the other:
1. Using that code is error-prone.
Only if it has never been tested/verified and/or if there is no clear understanding of the conditions and usage patterns under which the potentially error-prone code operates acceptably.
Over time you might realize that if you had refactored the code the first time you saw it, that would have saved you time in the long run.
That realization should not occur over time. Good engineering and work ethics call for that realization to occur when the artifact (be it hardware or software) is in the making.
So my question is - Practically - When do you decide it's time to refactor your code?
Practically, when I'm coding; I detect an area that needs improvement (or something that needs correction after a change on requirements or expectations); and I get an opportunity to improve it without sacrificing a deadline. If I cannot re-factor at that moment, I simply document the perceived defect and create a workable, realistic plan to revisit the artifact for refactoring.
In real life, there will be moments when we'll code some ugly kludge just to get things running, or because we are drained and tired or whatever. It's reality. Our job is to make sure that those incidents do not pile up and remain unattended. And the key to this is to refactor as you code, and to keep the code simple, with a good, simple and elegant structure. And by "elegant" I don't mean "smart-ass" or esoteric, but code that displays what are typically considered readable, simple, composable attributes (and mathematical attributes when they apply practically).
Good code lends itself to refactoring; it displays good metrics; its structure resembles both computer science function composition and mathematical function composition; it has a clear responsibility; it makes its invariants, pre and post-conditions evident; and so on and so on.
Hope it helps.
One of the most common mistakes I see is people associating the word "Refactor" with "Big Change".
Refactoring code does not always have to be big. Even small changes such as changing a bool to a proper enum, or renaming a method to be closer to the actual function vs. the intent is refactoring your code. With the exception of the end of a milestone, I try to make at least a very small refactoring every single time I check in. And you'd be amazed at how quickly this makes a visible difference in the code.
Bigger changes do take bigger planning though. I try to schedule about half a day every two weeks during a normal development cycle to tackle a bigger refactoring change. This is enough time to make a substantial improvement to the code base. If the refactoring fails, half a day is not that much of a loss. And it's rarely a total loss, because even a failed refactoring will teach you something about your code.
Whenever it smells, I refactor. I may not make it perfect now, but I can at least make a small step towards a better state. And those small changes do add up over time...
If I am in the middle of something when I notice the smell, and fixing it isn't trivial (or I am just before release), I may make a (mental or paper) note to return to it once I am finished with the primary task.
Practice makes one better :-) But if I don't see a solution to a problem, I put it aside and let it brew for a while, discuss it with coworkers, or even post it on SO ;-)
If I don't have unit tests and the fix isn't trivial, I start with the tests. If the tests aren't trivial either, I apply point 2.
I start to refactor as soon as I find I am repeating myself. DRY principles ftw.
Also, if methods/functions get too long, to the point where they look unwieldy, or their purpose is being obscured by the length of the function, I break it into private subfunctions that explain what is really going on.
Lastly, if everything's up and running, and the code is dog-slow, I start to look at refactoring for the sake of performance.
When implementing a new feature I often notice that the task would be much simpler if the code I'm working on was structured in a different way. In this case I usually step back, try to do the refactoring first, and only after this is done I continue implementing the new feature.
I also have a habit to track all potential improvements that come to my mind either in notes or the bug tracker. The ideas bake there for some time, some of them don't feel so compelling anymore, and the reasonable ones are implemented during a day which I dedicate to smaller tasks.
Refactor code when it needs to be refactored. Some symptoms I look for:
duplicate code in similar objects.
duplicate code within methods of one object.
anytime the requirements have changed twice or more.
anytime somebody says "we will clean that up later".
any time I read through code and shake my head thinking "what goofball did this" (even when the goofball in question is me)
In general, less design and/or less clear requirements means more opportunities for refactoring.
This might sound like a joke, but really, I only refactor when things "get messy": when a simple task starts taking more time than usual, or when I have to twist my mind around to remember which function is doing what. Also, if the code starts running slow and it's not because I'm running in a development environment (a lot of variable output and such), and I can't optimise it, I refactor the code. As you said, it's worth it in the long run.
Still, I always make sure I have enough time to think things through before I start, so I don't get into this situation.
Cheers!
I usually refactor when one of the following is true:
I have nothing better to do and am waiting for the next project to arrive in my inbox
The additions/changes I'm making to the code cannot work unless, or would be better if, I refactor
I am aesthetically displeased with the way the code is laid out
Martin Fowler, in his book of the same name, suggests you do it the third time you're in a block of code to make another change. First time in the block, you happen to notice you should refactor, but don't have time. Second time back... same thing. Third time back: now refactor.
Also, I read that the developers of a current release of Smalltalk (squeak.org, I think) say they go through a couple of weeks of intense coding... then they step back and look at what can be refactored.
Personally I have to resist the impulse to refactor as I code or I get 'paralyzed'.

Should code be short/concise? [closed]

When writing a mathematical proof, one goal is to continue compressing the proof. The proof gets more elegant but not necessarily more readable. Compression translates to better understanding, as you weed out unnecessary characters and verbosity.
I often hear developers say you should make your code footprint as small as possible. This can very quickly yield unreadable code. In mathematics it isn't such an issue, since the exercise is purely academic. However, in production code, where time is money, having people try to figure out what some very concise code is doing doesn't seem to make much sense. For slightly more verbose code, you get readability and savings.
At what point do you stop compressing software code?
I try to reach a level of verbosity where my program statements read like a sentence any programmer could understand. This does mean heavily refactoring my code such that it's all short pieces of a story, so each action would be described in a separate method (an even further level might be to another class).
Meaning I would not reduce my number of characters just because it can be expressed in fewer. That's what code-golf competitions are for.
My rule is say what you mean. One common way I see people go wrong is "strength reduction." Basically, they replace the concept they are thinking with something that seems to skip steps. Unfortunately, they are leaving concepts out of their code, making it harder to read.
For example, changing
for (int i = 0; i < n; i++)
    foo[i] = ...;
to
int *p = foo, *q = foo + n;
while (p < q)
    *p++ = ...;
is an example of a strength reduction that seems to save steps, but it leaves out the fact that foo is an array, making it harder to read.
Another common one is using bool instead of an enum.
enum {
    MouseDown,
    MouseUp
};
Having this be
bool IsMouseDown;
leaves out the fact that this is a state machine, making the code harder to maintain.
So my rule of thumb would be, in your implementation, don't dig down to a lower level than the concepts you are trying to express.
You can make code smaller by seeing redundancy and eliminating it, or by being clever. Do the former and not the latter.
Here's a good article by Steve McConnell - Best Practices http://www.stevemcconnell.com/ieeesoftware/bp06.htm
I think short/concise are two results from well written code. There are many aspects to make code good and many results from well written code, realize the two are different. You don't plan for a small foot print, you plan for a function that is concise and does a single thing extremely well - this SHOULD lead to a small foot print (but may not). Here's a short list of what I would focus on when writing code:
single focused functions - a function should do only one thing, a simple delivery, multi featured functions are buggy and not easily reusable
loosely coupled - don't reach out from inside one function to global data and don't rely heavily on other functions
precise naming - use meaningful precise variable names, cryptic names are just that
keep the code simple and not complex - don't overuse language-specific technical wows; they're good for impressing others but difficult to understand and maintain - if you do add something 'special', comment it so at least people can appreciate it prior to cursing you out
evenly comment - too many comments will be ignored and become outdated, too few have no meaning
formatting - take pride in how the code looks, properly indented code helps
work with the mind of a code maintenance person - think about what it would be like to maintain the code you're writing
don't be afraid or too lazy to refactor - nothing is perfect the first time, clean up your own mess
One way to find a balance is to aim for readability rather than conciseness. Programmers are constantly scanning code visually to see what is being done, so the code should flow as nicely as possible.
If the programmer is scanning code and hits a section that is hard to understand, or takes some effort to visually parse and understand, it is a bad thing. Using common well understood constructs is important, stay away from the vague and infrequently used unless necessary.
Humans are not compilers. Compilers can eat the stuff and keep moving on. Obscure code is not mentally consumed by humans as quickly as clearly understood code.
At times it is very hard to produce readable code in a complicated algorithm, but for the most part, human readability is what we should look for, and not cleverness. I don't think length of code is really a measure of clearness either, because sometimes a more verbose method is more readable than a concise method, and sometimes a concise method is more readable than a long one.
Also, comments should only supplement, and should not describe your code, your code should describe itself. If you have to comment a line because it isn't obvious what is done, that is bad. It takes longer for most experienced programmers to read an English explanation than it does to read the code itself. I think the book Code Complete hammers this one home.
As far as object names go, the thinking on this has gone through an evolution with the introduction of new programming languages.
If you take the "curly brace" languages, starting with C, brevity was considered the soul of wit. So, you would have a variable to hold a loan value named "lv", for instance. The idea was that you were typing a lot of code, so keep the keystrokes to a minimum.
Then along came the Microsoft-sanctioned "Hungarian notation", where the first letters of a variable name were meant to indicate its underlying type. One might use "fLV", or some such, to indicate that the loan value was represented by a float variable.
With Java, and then C#, the paradigm has become one of clarity. A good name for a loan value variable would be "loanValue". I believe part of the reason for this is the command-completion feature in most modern editors. Since it's not necessary to type an entire name anymore, you might as well use as many characters as needed to be descriptive.
This is a good trend. Code needs to be intelligible. Comments are often added as an afterthought, if at all. They are also not updated as code is updated, so they become out of date. Descriptive, well-chosen, variable names are the first, best and easiest way to let others know what you were coding about.
I had a computer science professor who said "As engineers, we are constantly creating types of things that never existed before. The names that we give them will stick, so we should be careful to name things meaningfully."
There needs to be a balance between short, sweet source code and performance. If the source is nice and runs the fastest, good; but if, for the sake of nice source, it runs like a dog, that's bad.
Strive to refactor until the code itself reads well. You'll discover your own mistakes in the process, the code will be easier to grok for the "next guy", and you won't be burdened by maintaining (and later forgetting to change) in comments what you're already expressed in code.
When that fails... sure, leave me a comment.
And don't tell me "what" in the comment (that's what the code is for), tell me "why".
As opposed to long/rambling? Sure!
But if it gets to the point where it's so short and so concise that it's hard to understand, then you've gone too far.
Yes. Always.
DRY: Don't Repeat Yourself. That will give you a code that is both concise and secure. Writing the same code several times is a good way to make it hard to maintain.
Now that does not mean you should make a function out of any blocks of code that look remotely alike.
A very common error (horror?), for instance, is factoring out code that does nearly the same thing and handling the differences between occurrences by adding a flag to the function's API. This may look innocuous at first, but it generates code flow that is hard to understand and bug-prone, and even harder to refactor.
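A small hypothetical sketch of that flag-argument trap (the names and the email/plain-text split are invented): the merged function forces every call site to know what the boolean means, and the branches tend to multiply as the two uses drift apart.

#include <string>

// Merged version: what does 'true' mean at the call site? render(text, true)
std::string render(const std::string& text, bool forEmail) {
    if (forEmail)
        return "<html><body>" + text + "</body></html>";
    return text;
}

// Clearer: two small, well-named functions that can still share helpers.
std::string render_for_email(const std::string& text) {
    return "<html><body>" + text + "</body></html>";
}
std::string render_for_plain_text(const std::string& text) {
    return text;
}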
If you follow common refactoring rules (looking out for code smells), your code will become more and more concise as a side effect, since many code smells are about detecting redundancy.
On the other hand, if you try to make the code as short as possible without following any meaningful guidelines, at some point you will have to stop because you just won't see how to reduce the code any further.
Just imagine if the first step is removing all useless whitespace... after that step, code in most programming languages will become so hard to read that you won't have much chance of finding any other possible enhancement.
The example above is a caricature, but not so far from what you get when trying to optimise for size without following any sensible guideline.
There's no exact line that can be drawn to distinguish between code that is glib and code that is flowery. Use your best judgment. Have others look at your code and see how easily they can understand it. But remember, correctness is the number 1 goal.
The need for small code footprints is a throwback to the days of assembly language and the first slightly higher-level languages, when small code footprints were a real and pressing need. These days, though, it's not so much of a necessity.
That said, I hate verbose code. Where I work, we write code that reads as much as possible like natural language, without any extra grammar or words. And we don't abbreviate anything unless it's a very common abbreviation.
Company.get_by_name("ABC")
makeHeaderTable()
is about as terse as we go.
In general, I make things obvious and easy to work with. If concision/shortness serves me in that end, all the better. Often short answers are the clearest, so shortness is a byproduct of obvious.
There are a couple of points, to my mind, that determine when to stop optimizing:
Whether it is worth spending the time performing optimizations. If you have people spending weeks and not finding anything, are there better uses of those resources?
What is the order of optimization priority. There are a few different factors that one could care about when it comes to code: Execution time, execution space(both running and just the compiled code), scalability, stability, how many features are implemented, etc. Part of this is the trade off of time and space, but it can also be where does some code go, e.g. can middleware execute ad hoc SQL commands or should those be routed through stored procedures to improve performance?
I think the main point is that there is a moderation that most good solutions will have.
Code optimization has little to do with coding style. The fact that the file contains x fewer spaces or newlines than at the beginning does not make it better or faster, at least at the execution stage - you format the code with whitespace characters that are usually ignored by the compiler. It even makes the code worse, because it becomes unreadable for other programmers and for yourself.
It is much more important for the code to be short and clean in its logical structure, such as testing conditions, control flow, assumptions, error handling or the overall programming interface. Of course, I would also include here smart and useful comments + the documentation.
There is not necessarily a correlation between concise code and performance. This is a myth. In mature languages like C/C++ the compilers are capable of optimizing the code very effectively. In such languages there is no reason to assume that more concise code is better-performing code. Newer, less performance-optimized languages like Ruby lack the compiler optimization features of C/C++ compilers, but there is still little reason to believe that concise code is better performing. The reality is that we never know how well code will perform in production until it gets into production and is profiled. Simple, innocuous functions can be huge performance bottlenecks if called from enough locations within the code. In highly concurrent systems the biggest bottlenecks are generally caused by poor concurrency algorithms or excessive locking. These issues are rarely solved by writing "concise" code.
The bottom line is this: Code that performs poorly can always be refactored once profiling determines it is the bottleneck. Code can only be effectively refactored if it is easy to understand. Code that is written to be "concise" or "clever" is often more difficult to refactor and maintain.
Write your code for human readability then refactor for performance when necessary.
My two cents...
Code should be short, concrete, and concentrated. You can always explain your ideas with many words in the comments.
You can make your code as short or compact as you like as long as you comment it. This way your code can be optimized but still make sense. I tend to stay somewhere in the middle, with descriptive variables and methods and sparse comments if it is still unclear.

Best practices: Many small functions/methods, or bigger functions with logical process components inline? [closed]

Is it better to write many small methods (or functions), or to simply write the logic/code of those small processes right into the place where you would have called the small method? What about breaking off code into a small function even if for the time being it is only called from one spot?
If one's choice depends on some criteria, what are they; how should a programmer make a good judgement call?
I'm hoping the answer can be applied generally across many languages, but if necessary, answers given can be specific to a language or languages. In particular, I'm thinking of SQL (functions, rules and stored procedures), Perl, PHP, Javascript and Ruby.
I always break long methods up into logical chunks and try to make smaller methods out of them. I don't normally turn a few lines into a separate method until I need it in two different places, but sometimes I do just to help readability, or if I want to test it in isolation.
Fowler's Refactoring is all about this topic, and I highly recommend it.
Here's a handy rule of thumb that I use from Refactoring. If a section of code has a comment that I could re-word into a method name, pull it out and make it a method.
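A small hypothetical example of that rule: the comment that used to sit above the block becomes the name of an extracted function (trim and normalize are invented names for the sketch).

#include <string>
#include <vector>

// Extracted from a block that was previously preceded by the comment
// "strip leading and trailing whitespace".
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

void normalize(std::vector<std::string>& lines) {
    for (auto& line : lines)
        line = trim(line);   // reads like the comment it replaced
}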
The size of the method is directly linked to its cyclomatic complexity.
The main advantages to keep the size of the method small (which means dividing a big method into several small methods) are:
better unit testing (due to low cyclomatic complexity)
better debugging due to a more explicit stack trace (instead of one error within one giant method)
As always you can say: it depends. It's more a question of naming and defining the task of a method. Every method should do one (not more) well-defined task and should do it completely. The name of the method should indicate the task. If your method is named DoAandB() it may be better to have separate methods DoA() and DoB(). If you need methods like setupTask, executeTask, finishTask, it may be useful to combine them.
Some points that indicate, that a merge of different methods may be useful:
A method cannot be used alone, without the use of other methods.
You have to be careful to call some dependent methods in the right order.
Some points that indicate, that a splitup of the method could be useful:
Some lines of the existing method have a clear, independent task.
Unit-testing of the big method gets problematic. If tests are easier to write for independent methods, then split the big method up.
As an explanation to the unit-test-argument: I wrote a method, that did some things including IO. The IO-part was very hard to test, so I thought about it. I came to the conclusion, that my method did 5 logical and independent steps, and only one of them involved the IO. So I split up my method into 5 smaller ones, four of them were easy to test.
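A much-simplified hypothetical sketch of that kind of split (two steps instead of five): the IO stays in one small function, and the remaining logic becomes a pure function that is trivial to unit test.

#include <fstream>
#include <numeric>
#include <string>
#include <vector>

// The hard-to-test step: all file IO is confined here.
std::vector<int> read_values(const std::string& path) {
    std::ifstream in(path);
    std::vector<int> values;
    int v;
    while (in >> v)
        values.push_back(v);
    return values;
}

// The easy-to-test step: pure logic, no IO.
int sum_values(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}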
Small methods every time.
They are self documenting (er, if well named)
They break down the problem into manageable parts - you are KeepingItSimple.
You can use OO techniques to more easily (and obviously) plug in behaviour. The large method is by definition more procedural and so less flexible.
They are unit testable. This is the killer, you simply can’t unit test some huge method that performs a load of tasks
Something I learnt from the book Code Complete:
Write methods/functions so that each implements one chunk (or unit, or task) of logic. If that requires a breakdown into subtasks, then write a separate method/function for them and call them.
If I find that the method/function name is getting long, then I try to examine the method to see if it can be broken down into two methods.
Hope this helps
Some rules of thumb:
Functions should not be longer than what can be displayed on screen
Break functions into smaller ones if it makes the code more readable.
I make each function do one thing, and one thing only, and I try not to nest too many levels of logic. Once you start breaking your code down into well named functions, it becomes a lot easier to read, and practically self-documenting.
I find that having many small methods makes code easier to read, maintain and debug.
When I'm reading through a unit that implements some business logic, I can better follow the flow if I see a series of method calls that describe the process. If I care about how the method is implemented, I can go look in the code.
It feels like more work but it ultimately saves time.
There is an art, I think, to knowing what to encapsulate. Everyone has some slight difference of opinion. If I could put it in words I'd say that each method should do one thing that can be described as a complete task.
The bigger the method, the harder to test and maintain. I find its much easier to understand how a large process works when its broken down into atomic steps. Also, doing this is a great first step to make your classes extensible. You can mark those individual steps as virtual (for inheritance), or move them into other objects (composition), making your application's behavior easier to customize.
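A hypothetical sketch of that idea in C++ (the report-generation example is invented): once the process is broken into named steps, a subclass can override a single step without touching the rest; composition would work just as well by passing the steps in as objects.

// The overall process reads as a list of steps; each step is small and
// can be customized independently.
class ReportGenerator {
public:
    virtual ~ReportGenerator() = default;

    void generate() {
        load_data();
        format();
        deliver();
    }

protected:
    virtual void load_data() { /* default: read from the local store */ }
    virtual void format()    { /* default: plain text */ }
    virtual void deliver()   { /* default: write to standard output */ }
};

// Customizes a single step via inheritance.
class EmailedReport : public ReportGenerator {
protected:
    void deliver() override { /* send the formatted report by email */ }
};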
I usually go for splitting functions into smaller functions that each perform a single, atomic task, but only if that function is complex enough to warrent it.
This way, I don't end up with multiple functions for simple tasks, and the functions I do extract can typically be used elsewhere as they don't try to achieve too much. This also aids unit testing as each function (as a logical, atomic action) can then be tested individually.
It depends a bit ... on mindset. Still, this is not an opinionated question.
The answer rather actually depends on the language context.
In a Java/C#/C++ world, where people are following the "Clean Code" school, as preached by Robert Martin, then: many small methods are the way to go.
A method has a clear name, and does one thing. One level of nesting, that's it. That limits its length to 3, 5, max 10 lines.
And honestly: I find this way of coding absolutely superior to any other "style".
The only downside of this approach is that you end up with many small methods, so ordering within a file/class can become an issue. But the answer to that is to use a decent IDE that allows you to navigate back and forth easily.
So, the only "legit" reason to use the "all stuff goes into one method/function" is when your whole team works like that, and prefers that style. Or when you can't use decent tooling (but then navigating that big ugly function won't work either).
Personally, I lean significantly in the direction of preferring more, smaller methods, but not to the point of religiously aiming for a maximum line count. My primary criterion or goal is to keep my code DRY. The minute I have a code block which is duplicated (whether in spirit or actually by the text), even if it might be 2 or 4 lines long, I DRY up that code into a separate method. Sometimes I will do so in advance if I think there's a good chance it will be used again in the future.
On the flip side, I have also heard it argued that if your broken-off method is too small, in the context of a team of developers, a teammate is likely not to know about your method, and will either write it inline or write his own small method that does the same thing. This is admittedly a bad situation.
Some also try to argue that it is more readable to keep things inline, so a reader can just read top-down, instead of having to jump around method definitions, possibly across multiple files. Personally, I think the existence of a stack trace makes this not much of an issue.

What optimizations are OK to do right away?

One of the most common mantras in computer science and programming is to never optimize prematurely, meaning that you should not optimize anything until a problem has been identified, since code readability/maintainability is likely to suffer.
However, sometimes you might know that a particular way of doing things will perform poorly. When is it OK to optimize before identifying a problem? What sorts of optimizations are allowable right from the beginning?
For example, using as few DB connections as possible, and paying close attention to that while developing, rather than using a new connection as needed and worrying about the performance cost later
I think you are missing the point of that dictum. There's nothing wrong with doing something the most efficient way possible right from the start, provided it's also clear, straightforward, etc.
The point is that you should not tie yourself (and worse, your code) in knots trying to solve problems that may not even exist. Save that level of extreme optimizations, which are often costly in terms of development, maintenance, technical debt, bug breeding grounds, portability, etc. for cases where you really need it.
I think you're looking at this the wrong way. The point of avoiding premature optimization isn't to avoid optimizing, it's to avoid the mindset you can fall into.
Write your algorithm in the clearest way that you can first. Then make sure it's correct. Then (and only then) worry about performance. But also think about maintenance etc.
If you follow this approach, then your question answers itself. The only "optimizations" that are allowable right from the beginning are those that are at least as clear as the straightforward approach.
The best optimization you can make at any time is to pick the correct algorithm for the problem. It's amazing how often a little thought yields a better approach that will save orders of magnitude, rather than a few percent. It's a complete win.
Things to look for:
Mathematical formulas rather than iteration.
Patterns that are well known and documented.
Existing code / components
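As a generic illustration of the first point in that list (not from the original answer): replacing a loop with a closed-form expression turns an O(n) computation into O(1).

#include <cstdint>

// Summing 1..n by iteration: O(n).
std::uint64_t sum_loop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i)
        total += i;
    return total;
}

// The same result from the closed-form formula: O(1).
std::uint64_t sum_formula(std::uint64_t n) {
    return n * (n + 1) / 2;
}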
IMHO, none. Write your code without ever thinking about "optimisation". Instead, think "clarity", "correctness", "maintainability" and "testability".
From Wikipedia:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
- Donald Knuth
I think that sums it up. The question is knowing if you are in the 3% and what route to take. Personally I ignore most optimizations until I at least get my code working. Usually as a separate pass with a profiler so I can make sure I am optimizing things that actually matter. Often times code simply runs fast enough that anything you do will have little or no effect.
If you don't have a performance problem, then you should not sacrifice readability for performance. However, when choosing a way to implement some functionality, you should avoid using code you know is problematic from a performance point of view. So if there are 2 ways to implement a function, choose the one likely to perform better, but if it's not the most intuitive solution, make sure you put in some comments as to why you coded it that way.
As you develop in your career as a developer, you'll simply grow in awareness of better, more reasonable approaches to various problems. In most cases I can think of, performance enhancement work resulted in code that was actually smaller and simpler than some complex tangle that evolved from working through a problem. As you get better, such simpler, faster solutions just become easier and more natural to generate.
Update: I'm voting +1 for everyone on the thread so far because the answers are so good. In particular, DWC has captured the essence of my position with some wonderful examples.
Documentation
Documenting your code is the #1 optimization (of the development process) that you can do right from the get-go. As projects grow, the more people you interact with and the more people need to understand what you wrote, the more time you will spend explaining what you wrote if it isn't documented.
Toolkits
Make sure your toolkit is appropriate for the application you're developing. If you're making a small app, there's no reason to invoke the mighty power of an Eclipse based GUI system.
Compilers
Let the compiler do the tough work. Most of the time, optimization switches on a compiler will do most of the important things you need.
System Specific Optimizations
Especially in the embedded world, gain an understanding of the underlying architecture of the CPU and system you're interacting with. For example, on a Coldfire CPU, you can gain large performance improvements by ensuring that your data lies on the proper byte boundary.
Algorithms
Strive to make access algorithms O(1) or O(Log N). Strive to make iteration over a list no more than O(N). If you're dealing with large amounts of data, avoid anything more than O(N^2) if it's at all possible.
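A hypothetical sketch of those complexity targets: looking items up in a hash set keeps each membership test at roughly O(1) on average, so scanning N words stays O(N) instead of the O(N^2) you would get from a nested loop over two vectors.

#include <string>
#include <unordered_set>
#include <vector>

std::vector<std::string> find_known_words(const std::vector<std::string>& words,
                                          const std::unordered_set<std::string>& dictionary) {
    std::vector<std::string> known;
    for (const auto& word : words)
        if (dictionary.count(word) != 0)   // average O(1) lookup
            known.push_back(word);
    return known;
}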
Code Tricks
Avoid, if possible. This is an optimization in itself - an optimization to make your application more maintainable in the long run.
You should avoid any optimization that is based only on the belief that the code you are optimizing will be slow. The only code you should optimize is code you know is slow (preferably through a profiler).
If you write clear, easy to understand code then odds are it'll be fast enough, and if it isn't then when you go to speed it up it should be easier to do.
That being said, common sense should apply (!). Should you read a file over and over again or should you cache the results? Probably cache the results. So from a high level architecture point of view you should be thinking of optimization.
The "evil" part of optimization is the "sins" that are committed in the name of making something faster - those sins generally result in the code being very hard to understand. I am not 100% sure this is one of them.. but look at this question here, this may or may not be an example of optimization (could be the way the person thought to do it), but there are more obvious ways to solve the problem than what was chosen.
Another thing you can do, which I recently did, is, when you are writing the code and you need to decide how to do something, write it both ways and run it through a profiler. Then pick the clearest way to code it unless there is a large difference in speed/memory (depending on what you are after). That way you are not guessing at what is "better" and you can document why you did it that way so that someone doesn't change it later.
The case that I was doing was using memory mapped files -vs- stream I/O... the memory mapped file was significantly faster than the other way, so I wasn't concerned if the code was harder to follow (it wasn't) because the speed up was significant.
Another case I had was deciding to "intern" String in Java or not. Doing so should save space, but at a cost of time. In my case the space savings wasn't huge, and the time was double, so I didn't do the interning. Documenting it lets someone else know not to bother interning it (or if they want to see if a newer version of Java makes it faster then they can try).
In addition to being clear and straightforward, you also have to take a reasonable amount of time to implement the code correctly. If it takes you a day to get the code to work right, instead of the two hours it would have taken if you'd just written it, then you've quite possibly wasted time you could have spent on fixing the real performance problem (Knuth's 3%).
I agree with Neil's opinion here: doing performance optimizations in code right away is a bad development practice.
IMHO, performance optimization is dependent on your system design. If your system has been designed poorly, from the perspective of performance, no amount of code optimization will get you 'good' performance - you may get relatively better performance, but not good performance.
For instance, if one intends to build an application that accesses a database, a well designed data model, one that has been de-normalized just enough, is likely to yield better performance characteristics than its opposite - a poorly designed data model that has been optimized/tuned to obtain relatively better performance.
Of course, one must not forget requirements in this mix. There are implicit performance requirements that one must consider during design - designing a public facing web site often requires that you reduce server-side trips to ensure a 'high-performance' feel to the end user. That doesn't mean that you rebuild the DOM on the browser on every action and repaint the same (I've seen this in reality), but that you rebuild a portion of the DOM and let the browser do the rest (which would have been handled by a sensible designer who understood the implicit requirements).
Picking appropriate data structures. I'm not even sure it counts as optimizing but it can affect the structure of your app (thus good to do early on) and greatly increase performance.
Don't call Collection.ElementCount directly in the loop check expression if you know for sure this value will be calculated on each pass.
Instead of:
for (int i = 0; i < myArray.Count; ++i)
{
    // Do something
}
Do:
int elementCount = myArray.Count;
for (int i = 0; i < elementCount; ++i)
{
    // Do something
}
A classical case.
Of course, you have to know what kind of collection it is (actually, how the Count property/method is implemented). It may not necessarily be costly.