When estimating tasks, how does one break from the grip of Hofstadter's law?
If you can politically: Estimate in small chunks, work in small iterations, and focus attention on what caused the deviation from the estimate to make the next estimate better.
One of the major causes of bad estimates, in my experience, is a lack of experience actually using the architecture planned for the project. By adjusting the estimates as things become more concrete and clear, the estimates get better over time.
The other major cause of bad estimates is bogus estimates: estimates kept artificially low to win a bid. The only way a consulting firm can break that cycle is to give good estimates, win enough projects, and deliver on those estimates to earn a reputation for hitting them. Enough clients will respect that to make a reasonable business out of it, but building that reputation will be hard.
Hofstadter's Law is not meant to be taken seriously --- if it were true to the letter, every task would take an infinite amount of time if you took Hofstadter's Law into account.
Estimate how long something should take to code.
Multiply by pi.
Be amazed by how often that is closer to how long it actually takes.
(This is also not to be taken as a scientific method, but it is another way of expressing how hard it is to correctly estimate time. I really use it sometimes, though...)
:)
Edit:
A method that is a bit more scientific: Specify a time for the absolute minimum and maximum time for a task, for example that it will definitely take between 5 and 30 hours. (Divide into subtasks to possibly narrow the time span somewhat.) You get a very wide time span, but at least it's more reliable than a guesstimate.
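As an illustration of summing per-subtask ranges (the task names and hour figures below are invented, not from any real project), a rough sketch in Python:

    # Range-based estimation sketch; all subtasks and hours are hypothetical.
    subtasks = {
        "parse input": (2, 6),    # (minimum hours, maximum hours)
        "core logic": (5, 20),
        "UI wiring": (3, 10),
    }

    low = sum(lo for lo, hi in subtasks.values())
    high = sum(hi for lo, hi in subtasks.values())
    print(f"Estimated range: {low}-{high} hours")  # here: 10-36 hours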
While "Hofstadter's Law" is a bit tongue-in-cheek, there are a couple of practices that can help you, in particular for first-pass/large item estimation:
Estimate in relative sizes. Meaning you don't say that an item takes X time; you say that item A is twice as big as item B, and that item B is about four times as large as item C.
Gather data from previous estimating rounds and use it as a baseline. So when you are estimating a project and notice that item A is about as big as item B from a previous iteration/project, and you know that item B took 2 days, you know that item A will most likely take about as long.
Use "wisdom-of-the-crowds" to get higher quality estimates. I've used Planning Poker in a couple of projects and the outcomes are rather good.
If you want to know more about this you can start by watching Mike Cohn's presentation (Part 1 and Part 2) and/or reading his book. While it's not the end-all, be-all of estimation, he does present some good practices and, best of all, the reasoning behind the practices.
See Evidence-Based Scheduling. There is already a SO discussion of some of its pitfalls here.
I used dice. Openly. In front of my manager. Typically I use 3 standard six-sided dice.
Boss: "How long is this going to take?"
Me: (rolls) "About 11 days."
Boss: "No, seriously."
Me: "Oh, seriously." (rolls) "About 7 days."
I also used to have a poster on my wall that said "Deadlines Amuse Me". Take from that what you will.
Base your estimates on past performance, not on best-case scenarios. This does require that you keep track of time spent on your projects. I don't care if you "know" that it will only take "6 weeks" to finish; if it took you 3 months to complete a similar project last time, it will probably take you 3 months the next time.
+1 for #Yishai - one of the benefits of an agile methodology like scrum is that people actually get feedback on the accuracy of their estimates.
You're never going to get better at something if you never know if you're wrong...
I like this method:
Make an honest estimate of the effort required for the task.
Apply a multiplier to the estimate. At least 1.5, probably 2.0. Over time, by comparing actual effort to estimated effort, you will be able to calculate the true multiplier.
Collecting the estimated and actual efforts are key to improving your estimates.
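As a minimal sketch of how that collected data might be turned into a personal multiplier (the history figures below are made up for illustration):

    # Hypothetical history of (estimated hours, actual hours) pairs.
    history = [(10, 18), (4, 9), (20, 35), (8, 13)]

    # One simple choice: the ratio of total actual to total estimated effort.
    multiplier = sum(actual for _, actual in history) / sum(est for est, _ in history)
    print(f"Observed multiplier: {multiplier:.2f}")  # ~1.79 for this made-up data

    raw_estimate = 12  # hours, a fresh gut-feel estimate
    print(f"Adjusted estimate: {raw_estimate * multiplier:.1f} hours")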
Agile estimation always uses "ideal hours" which implicitly takes into account Hofstadter's law. So you don't need to fudge.
If you're answering as an employee ...
"Gee, boss, In a perfect world it
would take X days. Let's add a
cushion to it and I'll do all I can to
get it to you in that amount of time.
If the estimate changes I'll let you
know immediately."
That is music to a boss's ears!
If you're answering as the business owner ...
You only give estimates to your customers when backed into a corner. Then you use ideal days, with clear disclaimers, and stay ready to adjust because you're aware of Hofstadter's law.
Estimating is an art, as you know, and there is a sub-art: the art of estimating contingency. :) In order to properly estimate contingency (generally a percentage of the total estimate), one must understand risks and mitigations. Basically, you multiply the probability of something happening by the damage it can do to come up with a risk factor. Then you sum up all your risk factors to estimate your total risk. Contingency should range from 15% for very low-risk projects (I never go below 15% contingency) to 50% for very high-risk ones (in my experience, very few customers will accept a contingency estimate higher than 50%).
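To make the risk-factor arithmetic concrete, here is a small sketch; the risks, probabilities, and impact figures are invented, and the clamping to the 15%-50% band simply follows the ranges described above:

    # Hypothetical risk register: (name, probability of occurring,
    # impact as a fraction of the base estimate if it occurs).
    risks = [
        ("unfamiliar architecture", 0.5, 0.30),
        ("third-party API changes", 0.2, 0.20),
        ("key developer unavailable", 0.1, 0.40),
    ]

    # Risk factor = probability * impact; contingency = sum of risk factors,
    # clamped to the 15%-50% band described above.
    contingency = sum(p * impact for _, p, impact in risks)
    contingency = min(max(contingency, 0.15), 0.50)

    base_estimate = 200  # hours
    print(f"Contingency: {contingency:.0%} -> {base_estimate * (1 + contingency):.0f} hours")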
Hofstadter's law is just another illustration of how notorious self-reference is! The subtle humour has far-reaching effects. In hindsight, this law suggests that every law/principle/axiom structured by logic is incomplete (Gödel-like), so even taking such laws into account, the logic may never be complete. The sense of infinity is again a play on Zeno's paradox (the tortoise vs. Achilles): infinite time for Achilles to complete the race, etc. These are illustrations of the omnipotent evil of self-reference which contaminates all affine logical structure.
I have a series of functions that are all designed to do the same thing. The same inputs produce the same outputs, but the time that it takes to do them varies by function. I want to determine which one is 'fastest', and I want to have some confidence that my measurement is 'statistically significant'.
Perusing Wikipedia and the interwebs tells me that statistical significance means that a measurement or group of measurements is different from a null hypothesis by a p-value threshold. How would that apply here? What is the null hypothesis between function A being faster than function B?
Once I've got that whole setup defined, how do I figure out when to stop measuring? I'll typically see that a benchmark is run three times, and then the average is reported; why three times and not five or seven? According to this page on Statistical Significance (which I freely admit I do not understand fully), Fisher used 8 as the number of samples that he needed to measure something with 98% confidence; why 8?
I would not bother applying statistics principles to benchmarking results. In general, the term "statistical significance" refers to the likelihood that your results were achieved accidentally, and do not represent an accurate assessment of the true values. In statistics, as a result of simple probability, the likelihood of a result being achieved by chance decreases as the number of measurements increases. In the benchmarking of computer code, it is a trivial matter to increase the number of trials (the "n" in statistics) so that the likelihood of an accidental result is below any arbitrary threshold you care to define (the "alpha" or level of statistical significance).
To simplify: benchmark by running your code a huge number of times, and don't worry about statistical measurements.
Note to potential down-voters of this answer: this answer is somewhat of a simplification of the matter, designed to illustrate the concepts in an accessible way. Comments like "you clearly don't understand statistics" will result in a savage beat-down. Remember to be polite.
You are asking two questions:
How do you perform a test of statistical significance that the mean time of function A is greater than the mean time of function B?
If you want a certain confidence in your answer, how many samples should you take?
The most common answer to the first question is that you either want to compute a confidence interval or perform a t-test. It's not different than any other scientific experiment with random variation. To compute the 95% confidence interval of the mean response time for function A simply take the mean and add 1.96 times the standard error to either side. The standard error is the square root of the variance divided by N. That is,
95% CI = mean +/- 1.96 * sqrt(sigma2/N)
where sigma2 is the variance of speed for function A and N is the number of runs you used to calculate mean and variance.
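A quick sketch of that calculation using Python's standard library (the timing figures are placeholders):

    import math
    import statistics

    # Hypothetical run times (seconds) for function A over N benchmark runs.
    times_a = [1.02, 0.98, 1.05, 1.01, 0.99, 1.03, 1.00, 0.97]

    n = len(times_a)
    mean = statistics.mean(times_a)
    se = statistics.stdev(times_a) / math.sqrt(n)  # standard error of the mean

    # 1.96 is the normal-approximation value; for small N a t critical
    # value would give a slightly wider interval.
    low, high = mean - 1.96 * se, mean + 1.96 * se
    print(f"95% CI for mean runtime: {mean:.3f} s ({low:.3f}, {high:.3f})")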
Your second question relates to statistical power analysis and the design of experiments. You describe a sequential setup where you are asking whether to continue sampling. The design of sequential experiments is actually a very tricky problem in statistics, since in general you are not allowed to calculate confidence intervals or p-values and then draw additional samples conditional on not reaching your desired significance. If you wish to do this, it would be wiser to set up a Bayesian model and calculate your posterior probability that speed A is greater than speed B. This, however, is massive overkill.
In a computing environment it is generally pretty trivial to achieve a very small confidence interval both because drawing large N is easy and because the variance is generally small -- one function obviously wins.
Given that Wikipedia and most online sources are still horrible when it comes to statistics, I recommend buying Introductory Statistics with R. You will learn both the statistics and the tools to apply what you learn.
The research you cite sounds more like a highly controlled environment. This is a purely practical answer that has proven itself time and again to be effective for performance testing.
If you are benchmarking code in a modern, multi-tasking, multi-core, computing environment, the number of iterations required to achieve a useful benchmark goes up as the length of time of the operation to be measured goes down.
So, if you have an operation that takes ~5 seconds, you'll want, typically, 10 to 20 iterations. As long as the deviation across the iterations remains fairly constant, then your data is sound enough to draw conclusions. You'll often want to throw out the first iteration or two because the system is typically warming up caches, etc...
If you are testing something in the millisecond range, you'll want 10s of thousands of iterations. This will eliminate noise caused by other processes, etc, firing up.
Once you hit the sub-millisecond range (down to tens of nanoseconds), you'll want millions of iterations.
Not exactly scientific, but neither is testing "in the real world" on a modern computing system.
When comparing the results, consider the difference in execution speed as percentage, not absolute. Anything less than about 5% difference is pretty close to noise.
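A bare-bones sketch of such a harness, assuming Python; the function under test, warm-up count, and iteration count are all placeholders to be tuned along the lines described above:

    import statistics
    import time

    def operation():
        # Placeholder for the code being benchmarked.
        sum(i * i for i in range(10_000))

    def benchmark(fn, warmup=2, iterations=20):
        # Discard the first few runs so caches and other warm-up effects settle.
        for _ in range(warmup):
            fn()
        samples = []
        for _ in range(iterations):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        return statistics.mean(samples), statistics.stdev(samples)

    mean, dev = benchmark(operation)
    print(f"mean {mean * 1000:.3f} ms, stdev {dev * 1000:.3f} ms")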
Do you really care about statistical significance or plain old significance? Ultimately you're likely to have to form a judgement about readability vs performance - and statistical significance isn't really going to help you there.
A couple of rules of thumb I use:
Where possible, test for enough time to make you confident that little blips (like something else interrupting your test for a short time) won't make much difference. Usually I reckon 30 seconds is enough for this, although it depends on your app. The longer you test for, the more reliable the test will be - but obviously your results will be delayed :)
Running a test multiple times can be useful, but if you're timing for long enough then it's not as important IMO. It would alleviate other forms of error which made a whole test take longer than it should. If a test result looks suspicious, certainly run it again. If you see significantly different results for different runs, run it several more times and try to spot a pattern.
The fundamental question you're trying to answer is: how likely is it that what you observe could have happened by chance? Is this coin fair? Throw it once: HEADS. No, it's not fair, it always comes down heads. Bad conclusion! Throw it 10 times and get 7 heads; now what do you conclude? 1000 times and 700 heads?
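To put rough numbers on the coin example, here is a small sketch using only the standard library; it computes the chance that a fair coin would produce at least that many heads:

    from math import comb

    def p_at_least(heads, flips):
        # Probability of >= `heads` heads in `flips` tosses of a fair coin.
        return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips

    print(f"7/10 heads:     p = {p_at_least(7, 10):.3f}")     # ~0.172 -- plausibly fair
    print(f"700/1000 heads: p = {p_at_least(700, 1000):.2e}")  # vanishingly small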
For simple cases we can imagine how to figure out when to stop testing. But you have a slightly different situation - are you really doing a statistical analysis?
How much control do you have over your tests? Does repeating them add any value? Your computer is deterministic (maybe). Einstein's definition of insanity is to repeat something and expect a different outcome. So when you run your tests, do you get repeatable answers? I'm not sure that statistical analyses help if you are doing good enough tests.
For what you're doing I would say that the first key thing is to make sure that you really are measuring what you think. Run every test for long enough that any startup or shutdown effects are hidden. Useful performance tests tend to run for quite extended periods for that reason. Make sure that you are not actually measuring the time in your test harness rather than the time in your code.
You have two primary variables: how many iterations of your method to run in one test? How many tests to run?
Wikipedia says this:
"In addition to expressing the variability of a population, standard deviation is commonly used to measure confidence in statistical conclusions. For example, the margin of error in polling data is determined by calculating the expected standard deviation in the results if the same poll were to be conducted multiple times. The reported margin of error is typically about twice the standard deviation."
Hence if your objective is to be sure that one function is faster than another you could run a number of tests of each, compute the means and standard deviations. My expectation is that if your number of iterations within any one test is high then the standard deviation is going to be low.
If we accept that definition of margin of error, you can see whether the two means are further apart than their combined margins of error.
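A small sketch of that comparison (the per-test timings are invented, and the margin of error is taken as roughly twice the standard deviation, following the quote above):

    import statistics

    # Hypothetical per-test mean run times (seconds) from repeated tests.
    tests_a = [1.21, 1.19, 1.23, 1.20, 1.22]
    tests_b = [1.31, 1.29, 1.33, 1.30, 1.32]

    mean_a, mean_b = statistics.mean(tests_a), statistics.mean(tests_b)
    # Margin of error taken as about twice the standard deviation, as in the quote.
    moe_a = 2 * statistics.stdev(tests_a)
    moe_b = 2 * statistics.stdev(tests_b)

    gap = abs(mean_a - mean_b)
    if gap > moe_a + moe_b:
        print(f"A and B differ by {gap:.3f} s, more than the combined margins of error")
    else:
        print("The difference is within the margins of error; keep testing")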
How many lines of code (LOC) does it take to be considered a large project? How about for just one person writing it?
I know this metric is questionable, but there is a significant difference, for a single developer, between 1k and 10k LOC. I typically use whitespace for readability, especially for SQL statements, and I try to reduce the amount of LOC for maintenance purposes, following as many best practices as I can.
For example, I created a unified diff of the code I modified today, and it was over 1k LOC (including comments and blank lines). Is "modified LOC" a better metric? I have ~2k LOC total, so it's surprising I modified 1k. I guess rewriting counts as both a deletion and an addition, which doubles the stats.
A slightly less useless metric - time of compilation.
If your project takes more than... say, 30 minutes to compile, it's large :)
Using Steve Yegge as the benchmark at the upper range of the scale, let's say that 500k lines of code is (over?) the maximum a single developer can maintain.
More seriously, though: I think once you hit 100k LOC you are probably going to want to start looking for refactorings before extensions to the code.
Note however that one way around this limit is obviously to compartmentalise the code more. If the sum-total of all code consists of two or three large libraries and an application, then combined this may well be more than you could maintain as a single code-base, but as long as each library is nicely self-contained you aren't going to exceed the capacity to understand each part of the solution.
Maybe another measurement for this would be the COCOMO measure - even though it is probably as useless as LOC.
A single developer could only do organic projects - "small" teams with "good" experience working with "less than rigid" requirements.
In this case, effort applied in man-months is calculated as
2.4 * (kLOC)^1.05
That said, 1 kLOC would need about 2.4 man-months. You can use several factors to refine that, based on product, hardware, personnel, and project attributes.
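A tiny sketch of the basic organic-mode formula, with all cost-driver adjustments left out:

    def cocomo_organic_effort(kloc: float) -> float:
        # Basic COCOMO, organic mode: effort in person-months.
        return 2.4 * kloc ** 1.05

    for size in (1, 10, 50):
        print(f"{size:>3} kLOC -> {cocomo_organic_effort(size):.1f} person-months")
    # 1 kLOC -> ~2.4, 10 kLOC -> ~26.9, 50 kLOC -> ~146 person-months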
But all we have done now is project LOC onto a time measurement. You again have to decide whether a 2-month or a 20-month project is considered large.
But as you said, LOC probably is not the right measure to use. Keywords: software metrics, function points, evidence-based scheduling, the planning game.
In my opinion it also depends on the design of your code. I've worked on projects in the 1-10k LOC range that were so poorly designed that they felt like really large projects.
But is LOC really an interesting measure for code? ;-)
Looking back at my past projects I often encounter this one:
A client or a manager presents a task to me and asks for an estimate. I give an estimate, say 24 hours. They also ask a business analyst, and from what I've heard, the analyst's experience is mostly non-technical. The analyst gives an estimate, say 16 hours. In the end, they go with the value given by the analyst, even though, besides providing my own estimate, I've explained the technical feasibility of the task to them. They treat the analyst's estimate as a "fact of life" even though it is only an estimate, and the true cost lies in the actual task itself. Worse, I see a pattern: they tend to be biased toward the lower value (if I present a lower estimate than the analyst, they quickly accept it) regardless of the feasibility of the task. If you have read Peopleware, these are the types of people who, given a set number of work hours, will do anything and everything in their power to shorten it, even when that is not really possible.
Do you have specific negotiation skills and tactics that you used before to avoid this?
If I can help it, I would almost never give a number like "24 hours". Doing so makes several implicit assumptions:
The estimate is accurate to within an hour.
All of the figures in the number are significant figures.
The estimate is not sensitive to conditions that may arise between the time you give the estimate and the time the work is complete.
In most cases these are demonstrably wrong. To avoid falling into the trap posed by (1), quote ranges to reflect how uncertain you are about the accuracy of the estimate: "3 weeks, plus or minus 3 days". This also takes care of (2).
To close the loophole of (3), state your assumptions explicitly: "3 weeks, plus or minus 3 days, assuming Alice and Bob finish the Frozzbozz component".
IMO, being explicit about your assumptions this way will show a greater depth of thought than the analyst's POV. I'd much rather pay attention to someone who's thought about this more intensely than someone who just pulled a number out of the air, and that will certainly count for plus points on your side of the negotiation.
Do you not have a work breakdown structure that validates your estimate?
If your manager/customer does not trust your estimate, you should be able to easily prove it beyond the ability of an analyst.
Nothing makes your estimate intrinsically better than theirs except a breakdown that shows it to be true. Something like this, for example:
Gather Feature Requirements (2 hours)
Design Feature (4 hours)
Build Feature
1 easy form (4 hours)
1 easy business component (4 hours)
1 easy stored procedure (2 hours)
Test Feature
3 easy unit tests (4 hours)
1 regression test (4 hours)
Deploy Feature
1 easy deployment (4 hours)
==========
(28 hours)
Then you say "Okay, I came up with 28 hours, show me where I am wrong. Show me how you can do it in 16."
Sadly, Scott Adams has had a lot to contribute to this debate:
Dilbert: "In a perfect world the project would take eight months. But based on past projects in this company, I applied a 1.5 incompetence multiplier. And then I applied an LWF of 6.3."
Pointy-Haired Boss: "LWF?"
Alice: "Lying Weasel Factor."
You can "control" clients a little easier than managers since the only power they really have is to not give the work to you (that solves your incorrect estimates problem pretty quickly).
But you just need to point out that it's not the analyst doing the work, it's you. And nobody is better at judging your times than you are.
It's a fact of life that people paying for the work (including managers) will focus on the lower figure. Many times I've submitted proper estimates with lower (e.g., $10,000) and upper bounds (e.g., $11,000) and had emails back saying that the clients were quite happy that I'd quoted $10,000 for the work.
Then, for some reason, they take umbrage when I bill them $10,500. You have to make it clear up front that estimates are, well, estimates, not guarantees. Otherwise they wouldn't be paying time-and-materials but fixed-price (and the fixed price would be considerably higher to cover the fact that the risk is now yours, not theirs).
In addition, you should include all assumptions and risks in any quotes you give. This will both cover you and demonstrate that your estimate is to be taken more seriously than some back-of-an-envelope calculation.
One thing you can do to try to fix this over time, and improve your estimating skills as well, is to track all of the estimates you make, and match those up with the actual time taken. If you can go back to your boss with a list of the last twenty estimates from both you and the business analyst, and the time each actually took, it will be readily apparent whose estimates you should trust.
Under no circumstances give a single figure; give a best, a worst, and a most likely. If you respond correctly, the next question should be "How do I get a more accurate number?", to which the answer should be more detailed requirements and/or design, depending on where you are in the lifecycle.
Then you give another, more refined range of best, most likely, and worst. This continues until you are done.
This is known as the cone of uncertainty. I have lost count of the number of times I have drawn it on a whiteboard when talking estimates with clients.
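If you do need to collapse a best/most-likely/worst range into a single planning number, one conventional choice (my addition, not something this answer prescribes) is the PERT weighted mean:

    def pert_estimate(best: float, most_likely: float, worst: float) -> float:
        # Classic three-point (PERT) weighted mean; the 4x weight on the
        # most-likely value is a convention, not a law.
        return (best + 4 * most_likely + worst) / 6

    print(pert_estimate(5, 12, 30))  # a 5-30 hour range, 12 most likely -> ~13.8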
Do you have specific negotiation skills and tactics that you used before to avoid this?
Don't work for such people.
Seriously.
Changing their behavior is beyond your control.
I am working on a roguelike and am using a GA to generate levels. My question is: how many levels should be in each generation of my GA, and how many generations should it have? Is it better to have a few levels in each generation with many generations, or the other way around?
There really isn't a hard and fast rule for this type of thing - most experiments like to use at least 200 members in a population at the barest minimum, scaling up to millions or more. The number of generations is usually in the 100 to 10,000 range. In general, to answer your final question, it's better to have lots of members in the population so that "late-bloomer" genes stay in a population long enough to mature, and then use a smaller number of generations.
But really, these aren't the important thing. The most critical part of any GA is the fitness function. If you don't have a decent fitness function that accurately evaluates what you consider to be a "good" level or a "bad" level, you're not going to end up with interesting results no matter how many generations you use, or how big your population is :)
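For reference, a stripped-down sketch of the loop being discussed; the genome encoding, fitness function, and population/generation counts are all stand-ins you would replace for a real level generator:

    import random

    POPULATION_SIZE = 200
    GENERATIONS = 100
    GENOME_LENGTH = 64  # stand-in encoding of a level

    def fitness(genome):
        # Placeholder: reward genomes with more 1s. A real roguelike would
        # score connectivity, room count, difficulty curve, etc.
        return sum(genome)

    def mutate(genome, rate=0.02):
        return [1 - g if random.random() < rate else g for g in genome]

    def crossover(a, b):
        cut = random.randrange(1, GENOME_LENGTH)
        return a[:cut] + b[cut:]

    population = [[random.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]

    for _ in range(GENERATIONS):
        population.sort(key=fitness, reverse=True)
        parents = population[:POPULATION_SIZE // 2]  # simple truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(POPULATION_SIZE - len(parents))]
        population = parents + children

    print("best fitness:", fitness(max(population, key=fitness)))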
Just as Mike said, you need to try different numbers. If you have a large population, you need to make sure to have a good selection function. With a large population, it is very easy to cause the GA to converge to a "not so good" answer early on.
Here is my situation, with rough numbers. I'd like to know if my thinking (at the bottom) seems sound to you guys. (Side note: I've read many of the related questions on here, and helpful as they were, none seemed to touch on this specific issue.)
For 2 years I was a senior developer at Company X. I was full-time, W-2, and making $100k/yr with benefits. (Roughly $50/hr).
[Then I got laid off, but that's not the point. I am in a large city and can find work easily. I am very happy to work from home rather than in an office.]
For 2 months I've done a few freelance projects for Company Y, a web firm. This was 1099, and I am charging $80/hr. (I did 100 or so hours over 2 or so months and figured I'd need to get some other clients soon).
Company Y loves my work and has gained new jobs because of it. They want more of my time and have offered me a 6 month contract, paid a fixed monthly rate regardless of hours (they assume 40ish per week). I'd still be working remotely.
So...
My freelance rate is higher than my old W-2 full-time rate for obvious reasons. I also realize that since freelancing "full time" requires lots of administrivia and sales, I would never really be racking up 40 hrs/wk at my $80 rate. (I've been toying with the idea of charging any other clients more, like $100/hr.)
However, I realize that from Company Y's perspective, offering me the security of a 6 month retainer contract should drive my hourly rate down (bulk discount?) since I'd now have way more billable hours and less administrivia. This still has got to be a raise on my old W-2 job for it to be worth my while though, especially due to the lack of benefits and the more complex tax situation.
Now I wish I had originally charged Company Y $100/hr for the initial freelance projects so that I could give them a better deal and charge them $80/hr for this 6 month contract.
Sorry for being so long winded, but I hope you guys get my drift. Essentially, I should be giving them a lower hourly, but I really don't want to.
Is my assumption correct that as far as hourly rates go,
full-time-W-2 < long-term 1099 < short-term-project-based 1099 ?
If so, what might a good negotiation strategy be with Company Y to keep my hourly rate as is, and effectively nix their bulk discount? "You were getting a super low rate on those individual projects!"
Company Y loves my work and has gained new jobs because of it. They want more of my time and have offered me a 6 month contract, paid a fixed monthly rate regardless of hours (they assume 40ish per week). I'd still be working remotely.
Are you sure about this? Any time I was asked to work for a "fixed monthly rate", it was a none-too-subtle way of trying to get a lot of "free" hours (effectively a massive rate cut).
I don't know of any consulting project where you can just quit at 40 hours, especially if the client gets a "push" where they need stuff sooner rather than later. The urgency is always theirs, and frequently manufactured rather than "real".
So, if they want you AND want a discount, give them maybe $70/hr for an HOURLY contract over the 6 months. That way they get a discount, and you get protection from overtime and any urgency that may arise.
Anything else and you WILL get hosed. Almost guaranteed.
I'm not from the US so the W-2 and 1099 part is beyond me but I'll address the rest as those issues are pretty universal.
Generally speaking, a rule of thumb is that if you earn $100k per year you should be charging $100/hour or pretty close to it. This is to cover some or all of:
No personal/sick leave;
No paid annual leave;
No bonuses;
No training;
Insurances (health, public liability, professional indemnity, etc);
If you are not contracted to a certain number of hours per week, there is variability in income;
The employer can get rid of you much more easily than a full-time employee.
Now this is my experience in Australia and Europe where you actually have quite significant public health care. I might imagine that since you don't in the US, the health insurance costs might drive this even higher so perhaps you should be asking for $120+/hour.
Note: if you're not paying things like professional indemnity insurance or you don't have some sort of legal protection (like operating through a limited liability company) you are playing with fire and I strongly urge you to seek professional advice on setting up a structure and/or obtaining relevant insurances to adequately protect you, your assets and your family if you have one.
Of course you have to balance this all out against the current market conditions, which aren't all that great (but vary from locale to locale).
I like the hourly rate scenario because it's "fair". By that I mean if you work 80 hours one week to get something out then you get paid for it. You just get paid for what you do and that's it. It's simple.
Now employers often don't like it because they can't necessarily predict (and thus budget) the costs.
The next step is to get paid a daily rate. I typically try to resist this, but I will go for it in certain situations. If so, you need to define exactly what a day is:
If you work at all do you get paid for a half day? A full day?
Do you need to work a certain number of hours to get paid for the day?
Do you only ever get paid for a full day no matter how many hours you work in that day?
Can you get paid for more than 5 days a week?
Generally for this sort of situation I'll multiply my hourly rate by 9, basing it on an 8-hour day. You're taking on some of the risk, so you need to get paid for that.
Beyond that you can go to weekly and then monthly rates. They too have the issue of having to define what constitutes a week or a month. There are on average 20 or 21 working days in a month, so multiply your daily rate by 21-25 to get a monthly rate.
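Putting those rules of thumb in one place (the numbers are the multipliers from this answer, not universal constants):

    hourly_rate = 100  # e.g. ~$100k/yr salary -> ~$100/hr, per the rule of thumb

    # Daily rate: hourly rate * 9, based on an 8-hour day plus a premium
    # for taking on the "what counts as a day" risk.
    daily_rate = hourly_rate * 9

    # Monthly rate: ~20-21 working days per month, quoted at 21-25x the
    # daily rate for the same reason.
    monthly_low, monthly_high = daily_rate * 21, daily_rate * 25

    print(f"daily: ${daily_rate:,}, monthly: ${monthly_low:,}-${monthly_high:,}")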
As for a negotiation strategy, pretty much use the points listed above. If $120/hour sounds like a lot (to them) point out all the costs involved, which are also costs they're saving. Use your proven track record to your advantage because I can guarantee you that there are few things more catastrophic to a company than incompetent software development.
You could just tell them that the contract is only for 40hrs/wk max, and if they need you to go over that then it will be at your new rate of $100/hr, which may not be a problem if you gave them a discount on the first 40hrs.
Then chalk this up as a lesson learned and for any new clients change your rate. :)
6 months at 40 hours per week is almost 1000 hours, so for every dollar you drop from your hourly rate, you'll be discounting them about $1,000. A drop of $5-6/hr should be significant enough, IMO.
Have this "discount" got into discussion as of yet? If not, the simplest solution would be just going forward with the implicit assumption of no change in payments -remember, a "discount" is something of a generousity, and not a mandatory.
I'm no expert, but I would explain to Company Y that the original rate WAS the discount. If you can convince them that you were charging the bare minimum all along instead of trying to milk more money out of them, I think they would consider that a positive.
If you were completely cool with knocking a significant percentage off of your rate, I think in the back of their minds they would think you were gaming them in the beginning.
As an analogy, say you go to a car lot. The salesman initially quotes $30,000. You come back with $20,000. He accepts without hesitation. You may actually end up with a good deal, but the salesman comes off as being shady anyway.
They want more of my time and have offered me a 6 month contract, paid a fixed monthly rate regardless of hours (they assume 40ish per week). I'd still be working remotely.
My argument would be:
I'll probably end up working over 40 hours per week. If you'd prefer a 6 month guaranteed contract, paid hourly instead of at a flat rate, we can renegotiate that.
However, I would also say that a 6-month contract is not necessarily "long term"; it's more "mid term".
So 1: you don't want to lower your rates, 2: they want a fixed rate.
It is likely that the fixed part is more important to them than the lower part. So call them, talk to them first, and find out whether that is true: is it about being fixed, or about being cheaper? Second, explain to them that you cannot lower your rate; there are plenty of arguments for that. Good luck!