Use of LOC to determine project size - language-agnostic

How many lines of code (LOC) does it take to be considered a large project? How about for just one person writing it?
I know this metric is questionable, but there is a significant difference, for a single developer, between 1k and 10k LOC. I typically use whitespace for readability, especially in SQL statements, and I try to reduce the amount of LOC for maintenance purposes, following as many best practices as I can.
For example, I created a unified diff of the code I modified today, and it came to over 1k LOC (including comments and blank lines). Is "modified LOC" a better metric? The whole codebase is only ~2k LOC, so it's surprising that I modified 1k; I suppose a rewrite counts as both a deletion and an addition, which doubles the stats.

A slightly less useless metric - time of compilation.
If your project takes more than... say, 30 minutes to compile, it's large :)

Using Steve Yegge as the benchmark at the upper range of the scale, let's say that 500k lines of code is (over?) the maximum a single developer can maintain.
More seriously though; I think once you hit 100k LOC you are probably going to want to start looking for re-factorings before extensions to the code.
Note however that one way around this limit is obviously to compartmentalise the code more. If the sum-total of all code consists of two or three large libraries and an application, then combined this may well be more than you could maintain as a single code-base, but as long as each library is nicely self-contained you aren't going to exceed the capacity to understand each part of the solution.

Maybe another measurement for this would be the COCOMO measure - even though it is probably as useless as LOC.
A single developer could only do organic projects - "small" teams with "good" experience working with "less than rigid" requirements.
In this case, effort in man-months is calculated as
2.4 * (kLOC)^1.05
So 1 kLOC would need 2.4 man-months. You can refine that with several cost-driver factors based on product, hardware, personnel, and project attributes.
But all we have done now is projected LOC to a time measurement. Here you again have to decide whether a 2-month or 20-month project is considered large.
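The basic COCOMO organic-mode formula is simple enough to play with directly. A sketch in Python, using the standard organic-mode coefficients (2.4 and 1.05) from the answer above:

```python
def cocomo_organic_effort(kloc):
    """Basic COCOMO effort for an 'organic' project, in man-months."""
    return 2.4 * kloc ** 1.05

# How effort scales (slightly super-linearly) with project size:
for size in (1, 10, 100):
    print(f"{size} kLOC -> {cocomo_organic_effort(size):.1f} man-months")
```

Note how 100 kLOC costs more than 100 times the effort of 1 kLOC; the exponent above 1 is COCOMO's way of modelling the growing coordination overhead.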
But as you said, LOC is probably not the right measure to use. Keywords: software metrics, function points, evidence-based scheduling, the planning game.

In my opinion it also depends on the design of your code - I've worked on projects in the 1-10k LOC range that were so poorly designed that they felt like really large projects.
But is LOC really an interesting measure for code? ;-)

Related

dynamic or pre calculate data

a bit new to programming and had a general question that I just thought of.
Say I have a database with a bunch of stock information: one column with price and another with earnings. To get the price/earnings ratio, would it be better to calculate it every day or to calculate it on demand? Performance-wise, I think it'd be quicker to only read a stored value, but I'm wondering whether, for math-type functions, it's worth the batch job to pre-calculate it (is the difference even noticeable?).
So how do the professionals do it? Have the application process the data, or have it already available in the database?
The professionals use a variety of methods. It all depends on what you're going for. Do the new real ratios need to be displayed immediately? How often is the core data changing? Ideally you would only calculate the ratio any time the price or earning changes, but this takes extra development, and it's probably not worth it if you don't have a substantial amount of activity on the site.
On the other hand, if you're receiving hundreds of visits every minute, you're definitely going to want to cache whatever you're calculating, as the time required to re-display a cached result is much less than recreating it (in most scenarios).
However, as a general rule of thumb, don't get stuck trying to optimize something you haven't anticipated any performance issues with.
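The middle ground described above, recalculating the ratio only when the price or earnings actually change, can be sketched with a small cache-invalidation pattern. The `Stock` class here is purely illustrative:

```python
class Stock:
    """Sketch: cache the P/E ratio, recompute only when an input changes."""

    def __init__(self, price, earnings):
        self._price = price
        self._earnings = earnings
        self._pe = None  # cached ratio; None means "stale"

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, value):
        self._price = value
        self._pe = None  # invalidate the cache on change
                         # (an earnings setter would do the same)

    @property
    def pe_ratio(self):
        if self._pe is None:           # recompute only when stale
            self._pe = self._price / self._earnings
        return self._pe

s = Stock(price=100.0, earnings=5.0)
print(s.pe_ratio)   # 20.0
s.price = 110.0     # invalidates the cache
print(s.pe_ratio)   # 22.0
```

The same idea applies at the database level: store the computed column and refresh it in the trigger or update path that changes the inputs.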
It would be good to keep statistical data in a separate, read-only table. You could calculate average, max, and min values directly with SQL functions and save them; in the meantime, for the current period (day), you could calculate and show values dynamically. This statistical information can be used for reports or forecasting.
Pre-calculated value is (of course) faster.
However, it all depends on the requirement itself.
Will this value be read frequently? If so, using a precalculated value will bring a huge advantage.
Does the calculation take a long time and/or heavy resources? If so, precalculating will be helpful.
Please bear in mind that sometimes a slow process or large resource consumption is caused by the implementation itself, not by a wrongly designed system.

Benchmarking: When can I stop making measurements?

I have a series of functions that are all designed to do the same thing. The same inputs produce the same outputs, but the time that it takes to do them varies by function. I want to determine which one is 'fastest', and I want to have some confidence that my measurement is 'statistically significant'.
Perusing Wikipedia and the interwebs tells me that statistical significance means that a measurement or group of measurements is different from a null hypothesis by a p-value threshold. How would that apply here? What is the null hypothesis between function A being faster than function B?
Once I've got that whole setup defined, how do I figure out when to stop measuring? I'll typically see that a benchmark is run three times, and then the average is reported; why three times and not five or seven? According to this page on Statistical Significance (which I freely admit I do not understand fully), Fisher used 8 as the number of samples that he needed to measure something with 98% confidence; why 8?
I would not bother applying statistics principles to benchmarking results. In general, the term "statistical significance" refers to the likelihood that your results were achieved accidentally, and do not represent an accurate assessment of the true values. In statistics, as a result of simple probability, the likelihood of a result being achieved by chance decreases as the number of measurements increases. In the benchmarking of computer code, it is a trivial matter to increase the number of trials (the "n" in statistics) so that the likelihood of an accidental result is below any arbitrary threshold you care to define (the "alpha" or level of statistical significance).
To simplify: benchmark by running your code a huge number of times, and don't worry about statistical measurements.
Note to potential down-voters of this answer: this answer is somewhat of a simplification of the matter, designed to illustrate the concepts in an accessible way. Comments like "you clearly don't understand statistics" will result in a savage beat-down. Remember to be polite.
You are asking two questions:
How do you perform a test of statistical significance that the mean time of function A is greater than the mean time of function B?
If you want a certain confidence in your answer, how many samples should you take?
The most common answer to the first question is that you either want to compute a confidence interval or perform a t-test. It's not different than any other scientific experiment with random variation. To compute the 95% confidence interval of the mean response time for function A simply take the mean and add 1.96 times the standard error to either side. The standard error is the square root of the variance divided by N. That is,
95% CI = mean +/- 1.96 * sqrt(sigma^2 / N)
where sigma^2 is the variance of speed for function A and N is the number of runs you used to calculate the mean and variance.
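The recipe above (mean plus or minus 1.96 times the standard error) translates directly into code. A sketch, where the timing samples are hypothetical:

```python
import math

def mean_ci95(samples):
    """95% confidence interval for the mean: mean +/- 1.96 * sqrt(var / N)."""
    n = len(samples)
    mean = sum(samples) / n
    # sample variance (divide by n - 1)
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean - half_width, mean + half_width

# hypothetical timings (seconds) for function A over N = 10 runs
times_a = [1.02, 0.98, 1.05, 1.01, 0.99, 1.03, 1.00, 0.97, 1.04, 1.01]
lo, hi = mean_ci95(times_a)
print(f"mean in [{lo:.3f}, {hi:.3f}] with 95% confidence")
```

If the interval for function A and the interval for function B do not overlap, you can be fairly confident one is genuinely faster.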
Your second question relates to statistical power analysis and the design of experiments. You describe a sequential setup where you are asking whether to continue sampling. The design of sequential experiments is actually a very tricky problem in statistics, since in general you are not allowed to calculate confidence intervals or p-values and then draw additional samples conditional on not reaching your desired significance. If you wish to do this, it would be wiser to set up a Bayesian model and calculate your posterior probability that speed A is greater than speed B. This, however, is massive overkill.
In a computing environment it is generally pretty trivial to achieve a very small confidence interval both because drawing large N is easy and because the variance is generally small -- one function obviously wins.
Given that Wikipedia and most online sources are still horrible when it comes to statistics, I recommend buying Introductory Statistics with R. You will learn both the statistics and the tools to apply what you learn.
The research you cite sounds more like a highly controlled environment. This is purely a practical answer that has proven itself time and again to be effective for performance testing.
If you are benchmarking code in a modern, multi-tasking, multi-core, computing environment, the number of iterations required to achieve a useful benchmark goes up as the length of time of the operation to be measured goes down.
So, if you have an operation that takes ~5 seconds, you'll want, typically, 10 to 20 iterations. As long as the deviation across the iterations remains fairly constant, then your data is sound enough to draw conclusions. You'll often want to throw out the first iteration or two because the system is typically warming up caches, etc...
If you are testing something in the millisecond range, you'll want 10s of thousands of iterations. This will eliminate noise caused by other processes, etc, firing up.
Once you hit the sub-millisecond range (down to tens of nanoseconds), you'll want millions of iterations.
Not exactly scientific, but neither is testing "in the real world" on a modern computing system.
When comparing the results, consider the difference in execution speed as percentage, not absolute. Anything less than about 5% difference is pretty close to noise.
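The advice above (many iterations per sample, throw away the warm-up, watch the spread) can be sketched with Python's timeit module. The operation and batch sizes here are placeholders; scale `number` up as the operation gets shorter:

```python
import timeit

def busy_work():
    # stand-in for the operation under test (microsecond range)
    return sum(i * i for i in range(1000))

# six timing samples of 1000 iterations each
samples = timeit.repeat(busy_work, repeat=6, number=1000)

# discard the first sample: it often includes cache warm-up effects
warm = samples[1:]
mean = sum(warm) / len(warm)
spread = max(warm) - min(warm)

print(f"mean per batch: {mean:.6f}s, spread: {spread:.6f}s")
# if spread / mean is large (say > 5%), take more samples
```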
Do you really care about statistical significance or plain old significance? Ultimately you're likely to have to form a judgement about readability vs performance - and statistical significance isn't really going to help you there.
A couple of rules of thumb I use:
Where possible, test for enough time to make you confident that little blips (like something else interrupting your test for a short time) won't make much difference. Usually I reckon 30 seconds is enough for this, although it depends on your app. The longer you test for, the more reliable the test will be - but obviously your results will be delayed :)
Running a test multiple times can be useful, but if you're timing for long enough then it's not as important IMO. It would alleviate other forms of error which made a whole test take longer than it should. If a test result looks suspicious, certainly run it again. If you see significantly different results for different runs, run it several more times and try to spot a pattern.
The fundamental question you're trying to answer is: how likely is it that what you observed could have happened by chance? Is this coin fair? Throw it once: HEADS. "No, it's not fair; it always comes down heads." Bad conclusion! Throw it 10 times and get 7 heads; now what do you conclude? 1000 times and 700 heads?
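The coin example can be made exact with a binomial tail probability; a sketch:

```python
from math import comb

def p_at_least(heads, tosses, p_head=0.5):
    """Probability of seeing `heads` or more heads in `tosses` fair flips."""
    return sum(comb(tosses, k) * p_head**k * (1 - p_head)**(tosses - k)
               for k in range(heads, tosses + 1))

# 7 heads in 10 tosses happens by chance about 17% of the time:
# not convincing evidence of bias
print(round(p_at_least(7, 10), 4))   # 0.1719

# 700 heads in 1000 tosses is astronomically unlikely by chance
print(p_at_least(700, 1000) < 1e-30)   # True
```

The same logic underlies benchmarking: more samples shrink the probability that the observed difference is just noise.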
For simple cases we can imagine how to figure out when to stop testing. But you have a slightly different situation - are you really doing a statistical analysis?
How much control do you have over your tests? Does repeating them add any value? Your computer is deterministic (maybe). Einstein's definition of insanity is to repeat something and expect a different outcome. So when you run your tests, do you get repeatable answers? I'm not sure that statistical analyses help if you are doing good enough tests.
For what you're doing, I would say the first key thing is to make sure that you really are measuring what you think you are. Run every test for long enough that any startup or shutdown effects are hidden; useful performance tests tend to run for quite extended periods for that reason. Make sure that you are not actually measuring the time spent in your test harness rather than the time in your code.
You have two primary variables: how many iterations of your method to run in one test? How many tests to run?
Wikipedia says this:

"In addition to expressing the variability of a population, standard deviation is commonly used to measure confidence in statistical conclusions. For example, the margin of error in polling data is determined by calculating the expected standard deviation in the results if the same poll were to be conducted multiple times. The reported margin of error is typically about twice the standard deviation."
Hence if your objective is to be sure that one function is faster than another you could run a number of tests of each, compute the means and standard deviations. My expectation is that if your number of iterations within any one test is high then the standard deviation is going to be low.
If we accept that definition of margin of error, you can see whether the two means are further apart than their combined margins of error.

How does one deal with Hofstadter's law?

When estimating tasks, how does one break from the grip of Hofstadter's law?
If you can politically: Estimate in small chunks, work in small iterations, and focus attention on what caused the deviation from the estimate to make the next estimate better.
One of the major causes of bad estimates in my experience is the lack of experience actually using the architecture planned for the project. By adjusting the estimates as things become more concrete and clear the estimates get better over time.
The other major cause of bad estimates is bogus estimates: estimates kept artificially low to win a bid. The only way a consulting firm can break that cycle is to give good estimates, win enough projects, and deliver on those estimates to earn a reputation for hitting them. Enough clients will respect that to make a reasonable business out of it, but building that reputation will be hard.
Hofstadter's Law is not meant to be taken seriously --- if it were true to the letter, every task would take an infinite amount of time if you took Hofstadter's Law into account.
Estimate how long something should take to code.
Multiply by pi.
Be amazed by how often that is closer to how long it actually takes.
(This is also not to be taken as a scientific method, but it is another way of expressing how hard it is to correctly estimate time. I really use it sometimes, though...)
:)
Edit:
A method that is a bit more scientific: specify the absolute minimum and maximum time for a task, for example that it will definitely take between 5 and 30 hours. (Divide the task into subtasks to possibly narrow the span somewhat.) You get a very wide time span, but at least it's more reliable than a guesstimate.
While "Hofstadter's Law" is a bit tongue-in-cheek, there are a couple of practices that can help you, in particular for first-pass/large item estimation:
Estimate in relative sizes. Meaning you don't say that an item takes X time; you say that item A is twice as big as item B, and that item B is about 4 times as large as item C.
Gather data from previous estimating rounds and use it as a baseline. Then, when you are estimating a project and notice that item A is about as big as item B from a previous iteration/project, and you know item B took 2 days, you know item A will most likely take about as long.
Use "wisdom-of-the-crowds" to get higher quality estimates. I've used Planning Poker in a couple of projects and the outcomes are rather good.
If you want to know more about this you can start by watching Mike Cohn's presentation (Part 1 and Part 2) and/or reading his book. While it's not the end-all, be-all of estimation, he does present some good practices and, best of all, the reasoning behind the practices.
See Evidence-Based Scheduling. There is already a SO discussion of some of its pitfalls here.
I used dice. Openly. In front of my manager. Typically I use 3 standard six-sided dice.
Boss: "How long is this going to take?"
Me: (rolls) "About 11 days."
Boss: "No, seriously."
Me: "Oh, seriously." (rolls) "About 7 days."
I also used to have a poster on my wall that said "Deadlines Amuse Me". Take from that what you will.
Base your estimates on past performance, not on best-case scenarios. This does require that you keep track of time spent on your projects. I don't care if you "know" that it will only take "6 weeks" to finish; if it took you 3 months to complete a similar project last time, it will probably take you 3 months the next time.
+1 for #Yishai - one of the benefits of an agile methodology like scrum is that people actually get feedback on the accuracy of their estimates.
You're never going to get better at something if you never know if you're wrong...
I like this method:
Make an honest estimate of the effort required for the task.
Apply a multiplier to the estimate: at least 1.5, probably 2.0. Over time, by comparing actual effort to estimated effort, you will be able to calculate your true multiplier.
Collecting the estimated and actual efforts are key to improving your estimates.
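Calibrating that multiplier from collected data is a few lines of arithmetic. A sketch, where the history figures are made up for illustration:

```python
# hypothetical history of (estimated, actual) effort in hours
history = [(10, 18), (4, 9), (20, 35), (8, 14)]

# your personal "true multiplier": average of actual / estimated
multiplier = sum(actual / estimated
                 for estimated, actual in history) / len(history)

def calibrated(raw_estimate):
    """Scale an honest estimate by the multiplier learned from past work."""
    return raw_estimate * multiplier

print(f"multiplier: {multiplier:.2f}")
print(f"a '10 hour' task really takes about {calibrated(10):.0f} hours")
```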
Agile estimation always uses "ideal hours" which implicitly takes into account Hofstadter's law. So you don't need to fudge.
If you're answering as an employee ...
"Gee, boss, in a perfect world it would take X days. Let's add a cushion to it and I'll do all I can to get it to you in that amount of time. If the estimate changes I'll let you know immediately."
That is music to a boss's ears!
If you're answering as the business owner ...
You only give estimates to your customers when backed into a corner. Then you use ideal days with clear disclaimers and be ready to adjust because you're aware of Hofstadter's law.
Estimating is an art, as you know, and there is a sub-art: the art of estimating contingency. :) In order to properly estimate contingency (generally a % of the total estimate), one must understand risks and mitigations. Basically, you multiply the probability of something happening by the damage it can do to come up with a risk factor. Then you sum up all your risk factors to estimate your total risk. Contingency should range from 15% for very low risk projects (I never go below 15% contingency) to 50% for very high risk (it has been my experience that very few customers will accept a contingency estimate higher than 50%).
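The probability-times-damage arithmetic described above can be sketched as follows. The risk register and the way total risk is mapped onto the 15%-50% band are purely illustrative assumptions:

```python
# hypothetical risk register: (probability of occurring, relative damage 0-1)
risks = [
    (0.3, 0.8),   # key developer unavailable
    (0.5, 0.4),   # unfamiliar architecture
    (0.1, 0.9),   # third-party API changes
]

# risk factor = probability x damage, summed over all risks
total_risk = sum(p * damage for p, damage in risks)

# one hypothetical mapping onto the 15%-50% contingency band
contingency = min(0.50, max(0.15, total_risk))
print(f"contingency: {contingency:.0%}")
```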
Hofstadter's law is just another illustration of how notorious self-reference is; the subtle humour has far-reaching effects. In hindsight the law suggests, Gödel-like, that any law/principle/axiom structured by logic is incomplete, so even after taking such laws into account, the logic may never be complete. The hint of infinity recalls Zeno's paradox (the tortoise vs. Achilles, with Achilles taking infinite time to finish the race); these are all illustrations of the omnipotent evil of self-reference that contaminates any such logical structure.

"Proper" way to give clients or managers a reality check on software estimates

Looking back at my past projects I often encounter this one:
A client or a manager presents a task to me and asks for an estimate. I give an estimate, say 24 hours. They also ask a business analyst whose experience, from what I've heard, is mostly non-technical. The analyst gives an estimate, say 16 hours. In the end, they go with the analyst's value, even though, aside from providing my own estimate, I've explained the technical feasibility of the task. They treat the analyst's estimate as a "fact of life", even though it is only an estimate and the true cost lies in the actual task itself. Worse, I see a pattern: they tend to be biased toward the lower value (if I present a lower estimate than the analyst, they quickly accept it), regardless of the task's feasibility. If you have read Peopleware, they are the types of people who, given a set of work hours, will do anything and everything in their power to shorten it, even though that is not really possible.
Do you have specific negotiation skills and tactics that you used before to avoid this?
If I can help it, I would almost never give a number like "24 hours". Doing so makes several implicit assumptions:
The estimate is accurate to within an hour.
All of the figures in the number are significant figures.
The estimate is not sensitive to conditions that may arise between the time you give the estimate and the time the work is complete.
In most cases these are demonstrably wrong. To avoid falling into the trap posed by (1), quote ranges to reflect how uncertain you are about the accuracy of the estimate: "3 weeks, plus or minus 3 days". This also takes care of (2).
To close the loophole of (3), state your assumptions explicitly: "3 weeks, plus or minus 3 days, assuming Alice and Bob finish the Frozzbozz component".
IMO, being explicit about your assumptions this way will show a greater depth of thought than the analyst's POV. I'd much rather pay attention to someone who's thought about this more intensely than someone who just pulled a number out of the air, and that will certainly count for plus points on your side of the negotiation.
Do you not have a work breakdown structure that validates your estimate?
If your manager/customer does not trust your estimate, you should be able to easily prove it beyond the ability of an analyst.
Nothing makes your estimate intrinsically better than his beyond the breakdown that shows it to be true. Something like this for example:
Gather Feature Requirements (2 hours)
Design Feature (4 hours)
Build Feature
1 easy form (4 hours)
1 easy business component (4 hours)
1 easy stored procedure (2 hours)
Test Feature
3 easy unit tests (4 hours)
1 regression test (4 hours)
Deploy Feature
1 easy deployment (4 hours)
==========
(28 hours)
Then you say "Okay, I came up with 28 hours, show me where I am wrong. Show me how you can do it in 16."
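The breakdown above can even be kept as data, so the total is computed rather than hand-added and each line item can be challenged individually:

```python
# the work breakdown structure above, with estimates in hours
breakdown = {
    "gather feature requirements": 2,
    "design feature": 4,
    "build: 1 easy form": 4,
    "build: 1 easy business component": 4,
    "build: 1 easy stored procedure": 2,
    "test: 3 easy unit tests": 4,
    "test: 1 regression test": 4,
    "deploy: 1 easy deployment": 4,
}

total = sum(breakdown.values())
print(f"total: {total} hours")   # total: 28 hours
```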
Sadly, Scott Adams had a lot to contribute to this debate:
Dilbert: "In a perfect world the project would take eight months. But based on past projects in this company, I applied a 1.5 incompetence multiplier. And then I applied an LWF of 6.3."
Pointy-Haired Boss: "LWF?"
Alice: "Lying Weasel Factor."
You can "control" clients a little easier than managers since the only power they really have is to not give the work to you (that solves your incorrect estimates problem pretty quickly).
But you just need to point out that it's not the analyst doing the work, it's you. And nobody is better at judging your times than you are.
It's a fact of life that people paying for the work (including managers) will focus on the lower figure. Many times I've submitted proper estimates with lower (e.g., $10,000) and upper bounds (e.g., $11,000) and had emails back saying that the clients were quite happy that I'd quoted $10,000 for the work.
Then, for some reason, they take umbrage when I bill them $10,500. You have to make it clear up front that estimates are, well, estimates, not guarantees. Otherwise they wouldn't be paying time-and-materials but fixed-price (and the fixed price would be considerably higher to cover the fact that the risk is now yours, not theirs).
In addition, you should include all assumptions and risks in any quotes you give. This will both cover you and demonstrate that your estimate is to be taken more seriously than some back-of-an-envelope calculation.
One thing you can do to try to fix this over time, and improve your estimating skills as well, is to track all of the estimates you make, and match those up with the actual time taken. If you can go back to your boss with a list of the last twenty estimates from both you and the business analyst, and the time each actually took, it will be readily apparent whose estimates you should trust.
Under no circumstances give a single figure; give a best, a worst, and a most likely. If you respond this way, the next question should be "How do I get a more accurate number?", to which the answer is more detailed requirements and/or design, depending on where you are in the lifecycle.
Then you give another, more refined range of best, most likely, and worst. This continues until you are done.
This is known as the cone of uncertainty. I have lost count of the number of times I have drawn it on a whiteboard when talking estimates with clients.
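One widely used way to condense a best/most-likely/worst range into a single figure is the three-point (PERT) weighting; the hour figures below are hypothetical:

```python
def three_point(best, likely, worst):
    """PERT-style estimate: weight the most-likely case 4x,
    and use (worst - best) / 6 as a rough spread."""
    expected = (best + 4 * likely + worst) / 6
    spread = (worst - best) / 6
    return expected, spread

expected, spread = three_point(best=5, likely=12, worst=30)
print(f"expect about {expected:.1f} hours, give or take {spread:.1f}")
```

Reporting "about 14 hours, give or take 4" communicates the uncertainty in a way a single number never can.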
Do you have specific negotiation skills and tactics that you used before to avoid this?
Don't work for such people.
Seriously.
Changing their behavior is beyond your control.

How many units should there be in each generation of a genetic algorithm?

I am working on a roguelike and am using a GA to generate levels. My question is: how many levels should be in each generation of my GA, and how many generations should it have? Is it better to have a few levels in each generation with many generations, or the other way around?
There really isn't a hard and fast rule for this type of thing - most experiments like to use at least 200 members in a population at the barest minimum, scaling up to millions or more. The number of generations is usually in the 100 to 10,000 range. In general, to answer your final question, it's better to have lots of members in the population so that "late-bloomer" genes stay in a population long enough to mature, and then use a smaller number of generations.
But really, these aren't the important thing. The most critical part of any GA is the fitness function. If you don't have a decent fitness function that accurately evaluates what you consider to be a "good" level or a "bad" level, you're not going to end up with interesting results no matter how many generations you use, or how big your population is :)
Just as Mike said, you need to try different numbers. If you have a large population, you need to make sure to have a good selection function. With a large population, it is very easy to cause the GA to converge to a "not so good" answer early on.
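To make the knobs concrete, here is a minimal GA skeleton with explicit population-size and generation-count parameters. The bit-string genome and one-max fitness are stand-ins; a real roguelike would encode and score actual levels:

```python
import random

random.seed(42)

GENOME_LEN = 20       # stand-in for a level encoding
POP_SIZE = 50         # "units per generation"
GENERATIONS = 40
MUTATION_RATE = 0.02

def fitness(genome):
    """Toy fitness: count of 1-bits. A real roguelike would score
    connectivity, difficulty curve, item placement, etc."""
    return sum(genome)

def tournament(pop, k=3):
    # selection: best of k random members (larger k = stronger pressure)
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(genome):
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
       for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    best = max(pop, key=fitness)
    next_pop = [best]   # elitism: best fitness never regresses
    while len(next_pop) < POP_SIZE:
        child = mutate(crossover(tournament(pop), tournament(pop)))
        next_pop.append(child)
    pop = next_pop

print("best fitness:", fitness(max(pop, key=fitness)))
```

Trading POP_SIZE against GENERATIONS (for the same total evaluations) is exactly the experiment worth running: as the answers above note, a larger population keeps "late-bloomer" genes alive longer.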