Modeling an RTS or how Blizzard was able to put together Starcraft 1 & 2? - language-agnostic

Not sure if this question is related to software development, but I hope someone can at least point me in the right direction (no, not that direction...).
I am very curious as to how Blizzard achieves such a balance of strategic/tactical forces in their games. If you look at StarCraft 1, or now 2, each race has unique features that counter the unique features of the other races, and all together they create a pretty beautiful (to my mind at least) balance.
Is there some sort of area of mathematics that could help model these things? How do they do it, basically?

I don't have a full answer, but here is what I know. Initially, when the game is technically ready, the balance is not ideal. When they started the first public beta there were holes in the balance that they patched very quickly. They let players (testers) play as is, captured statistics on the percentage of wins per race, and tuned the parameters accordingly. By the end of the beta the ratio was almost ideal: 33%/33%/33%.

I've no idea how Blizzard specifically did it - it might have just been through a lot of user testing.
However, there is an entire field of CS and statistics dedicated to these kinds of problems: simulation. Generally the idea is to construct a model of your system and then generate inputs according to some statistical distribution to try to understand the behaviour of that model. In the case of finding game balance, you would want to show that most sequences of game events lead to some kind of equilibrium. This would probably involve some kind of stochastic modeling.
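For instance, a minimal Monte Carlo sketch in Python - the unit stats and the crude combat model are invented for illustration, not Blizzard's actual numbers - that simulates many engagements between two equal-cost armies and checks how far the win rate drifts from 50%:

import random

# Hypothetical unit profiles: hit points, damage per second, resource cost.
# These numbers are invented for illustration only.
UNITS = {
    "marine": {"hp": 45,  "dps": 7.0, "cost": 50},
    "zealot": {"hp": 150, "dps": 8.0, "cost": 100},
}

def army_stats(composition):
    """Aggregate hit points and damage output of an army, e.g. {"marine": 10}."""
    hp = sum(UNITS[u]["hp"] * n for u, n in composition.items())
    dps = sum(UNITS[u]["dps"] * n for u, n in composition.items())
    return hp, dps

def simulate_battle(a, b, noise=0.15):
    """Crude duel model with random noise; returns True if army a wins."""
    hp_a, dps_a = army_stats(a)
    hp_b, dps_b = army_stats(b)
    # Time each side needs to destroy the other, jittered to stand in for micro, terrain, luck.
    t_kill_b = hp_b / (dps_a * random.uniform(1 - noise, 1 + noise))
    t_kill_a = hp_a / (dps_b * random.uniform(1 - noise, 1 + noise))
    return t_kill_b < t_kill_a

def win_rate(a, b, trials=100_000):
    wins = sum(simulate_battle(a, b) for _ in range(trials))
    return wins / trials

if __name__ == "__main__":
    # Equal-cost armies: 10 marines (500 resources) vs 5 zealots (500 resources).
    rate = win_rate({"marine": 10}, {"zealot": 5})
    print(f"marines beat zealots in {rate:.1%} of simulated fights")
    # A designer would tweak hp/dps/cost until equal-cost armies sit near 50%.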

Linear algebra, maybe a little calculus, and a lot of testing.
There's no real mystery to this sort of thing, but people often just don't think about it in the rigorous terms you need to get a system that is fairly well-balanced.
Basically the designers know how quickly you can gather resources (both the best case and the average case), they know how long it takes to build a unit, and they know roughly how powerful a unit is (e.g. by reference to approximations such as damage per second). With this, you can ensure a unit's cost in resources makes sense. From that, it's possible to compare resource gathering with unit cost to model the strength of a force growing over time. Similarly you can measure a force's capacity for damage over time and compare the two. If there's a big disparity, they can tweak one or more of the variables to reduce it. Obviously there are many permutations of units, but the interesting thing is that you only really need to understand the most powerful permutations - if a player picks a poor combination, it's fine if they lose. You don't want every approach to be equally good, as that would imply a boring game where your decisions are meaningless.
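A rough sketch of that kind of back-of-the-envelope model (the income rates, costs, and damage numbers are placeholders, not real game values), comparing how much total damage-per-second each side can field after t minutes:

# Placeholder economy/unit numbers for two hypothetical factions.
INCOME_PER_MIN = {"faction_a": 700, "faction_b": 650}   # resources mined per minute
UNIT_COST      = {"faction_a": 100, "faction_b": 125}   # cost of the faction's core unit
UNIT_DPS       = {"faction_a": 9.0, "faction_b": 12.0}  # damage per second of that unit

def fielded_dps(faction, minutes):
    """Total DPS a faction can put on the map after `minutes`, assuming
    every resource goes into its core combat unit (best case, no tech)."""
    units_built = (INCOME_PER_MIN[faction] * minutes) // UNIT_COST[faction]
    return units_built * UNIT_DPS[faction]

for t in (5, 10, 15):
    a, b = fielded_dps("faction_a", t), fielded_dps("faction_b", t)
    print(f"minute {t:2d}: faction_a={a:.0f} dps, faction_b={b:.0f} dps, ratio={a/b:.2f}")
# If the ratio drifts far from 1.0 at the timings that matter, something
# (cost, build time, income, or damage) probably needs tweaking.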
This is, of course, a simplification, but it helps get everything in roughly the right place and ensure that there are no units that are useless. Then testing can hammer down the rough edges and find most of the exploits that remain.

If you've tried SCII you've noticed that Blizzard records the relevant data for each game played on B.Net. They did the same in WC3, and presumably in SC1. With enough games stored, it is possible to get accurate results from statistical analysis. For example, the Protoss should win a third of all match-ups with similarly skilled opponents. If this is not the case, Blizzard can analyze the games where the Protoss won vs the games where they lost, and see what units made the difference. Then they can nerf those units (with a bit of in-house testing), and introduce the change at the next patch. Balancing is a difficult problem - a change that fixes problems in top-level games may break the balance in mid-level games.
Proper testing in a closed beta is impossible - you just can't accumulate enough data. This is why Blizzard does open betas. Even so, the real test is the release of the game.
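As a hedged sketch of the kind of statistical check described above (the win counts are invented), you can test whether an observed win split across the three races is consistent with the ideal one-third each, using a plain chi-square goodness-of-fit test:

# Hypothetical counts of wins by race in non-mirror matches (invented numbers).
wins = {"terran": 3590, "zerg": 3310, "protoss": 3100}

total = sum(wins.values())
expected = total / 3  # ideal balance: each race wins a third of games

# Chi-square goodness-of-fit statistic against the uniform split.
chi2 = sum((obs - expected) ** 2 / expected for obs in wins.values())

# Critical value for 2 degrees of freedom at the 5% significance level.
CRITICAL_5PCT_DF2 = 5.991

print(f"chi-square = {chi2:.1f}")
if chi2 > CRITICAL_5PCT_DF2:
    print("win rates deviate significantly from 33/33/33 - look at which units drive it")
else:
    print("no significant imbalance detected at this sample size")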

Related

Coding Competition, language agnostic guidelines?

I might be running a coding competition soon, and I was wondering if anyone has organized one and what the guidelines/process were.
I'd like to make the competition appealing to all devs, and I'm trying to come up with ideas as to how.
The scenario is: there is an event running and we (the coding competition organizers) will have a room that we can use (either to code in or for questions, etc.); however, ideally the task for the competition should be assigned and participants should be able to go and do other things, if they are so inclined.
What I wonder is what kind of challenges to give and, most importantly, what the criteria to "win" should be. Teaching and learning good coding standards takes a long time, and I'd like to think that if you've been coding for longer you'll do things right and quickly... but in a competition, you would be cutting corners...
I would really appreciate your input on this.
A competition that's appealing to all devs? That sounds... difficult. But if you want to make the competition about problem solving and algorithms, then I am a big fan of the Sphere Online Judge. Basically this is a repository of programming puzzles but you can also become a problem setter and create problems or contests on the site.
It supports a huge number of programming languages, from the "popular" ones to more obscure ones. Programs will typically read from standard in, and write to standard out. The standard judge program will simply diff a program's output with the expected output, but more elaborate judges are possible. You also set a time limit for the execution of submissions, which usually requires programmers to be more clever than brute force.
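A minimal sketch of that style of judge in Python (the file names, the command, and the time limit are assumptions; SPOJ's real infrastructure is far more elaborate): run the submission with the test input on stdin, enforce a time limit, and diff its output against the expected output.

import subprocess

def judge(cmd, input_path, expected_path, time_limit=2.0):
    """Run `cmd` (e.g. ["python3", "submission.py"]) on one test case.
    Returns "AC" (accepted), "WA" (wrong answer), "TLE", or "RE"."""
    with open(input_path) as f_in, open(expected_path) as f_exp:
        expected = f_exp.read()
        try:
            result = subprocess.run(
                cmd, stdin=f_in, capture_output=True, text=True,
                timeout=time_limit,
            )
        except subprocess.TimeoutExpired:
            return "TLE"
    if result.returncode != 0:
        return "RE"
    # Naive diff: ignore trailing whitespace on each line.
    got = [line.rstrip() for line in result.stdout.splitlines()]
    want = [line.rstrip() for line in expected.splitlines()]
    return "AC" if got == want else "WA"

if __name__ == "__main__":
    print(judge(["python3", "submission.py"], "case1.in", "case1.expected"))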
The winner is whoever solves the most problems. Ties are broken by the time of correct submissions, with a time penalty for wrong submissions.
Guidelines
Limit the languages which can be submitted. If you don't, you may get proprietary languages which require a special purchased compiler or some other inconvenience.
Correctness
This is easy. Provide an easy-to-read unit test in all languages you will accept. This will allow simple, automatic testing of submissions, and will guide the interface of the solution.
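For instance, a minimal sketch in Python - the challenge, the solution.py module, and the longest_run function are all hypothetical, just to show the shape of the harness you would hand out:

import unittest

# Hypothetical contestant entry point: longest_run("aaabbc") should return 3.
from solution import longest_run  # contestants supply solution.py

class TestLongestRun(unittest.TestCase):
    def test_examples(self):
        self.assertEqual(longest_run("aaabbc"), 3)
        self.assertEqual(longest_run(""), 0)
        self.assertEqual(longest_run("abc"), 1)

if __name__ == "__main__":
    unittest.main()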
Challenges
Create a theme. Make it focused, but not too specific as to require certain paradigms or language features. Then develop challenges based around that theme.
Assign points to each challenge. Give more points to more difficult problems. Be sure to review each challenge carefully and have a team attempt them before giving points so you can make a more accurate decision.
As #miorel mentioned in his answer, time limits and memory limits are wonderful. Set a time limit per test per challenge, or at least monitor them and have these metrics contribute to the points given for solving the problem.
You should look at the ACM competition. Each year they have collegiate programming competitions. These are language agnostic. The archive is located here.
http://www.ntnu.edu.tw/acm/
To start improving yourself, you need a project you can work on, a problem you want to solve, something you want to achieve. Without any context or a destination you want to reach, you won't be able to learn all the necessary methods and all the ins and outs of a language.
There is a competition that takes place twice a year called Ludum Dare.
It doesn't matter which language you write in; you just have to create a game within 48 hours (the compo: one person, all assets created by yourself) or 72 hours (the jam: a team working together, which may purchase assets). After the competition, when everyone has uploaded their game, the voting starts. This lasts about 20 days, during which everybody can vote on your game and you can vote on other people's games. Approximately 3,000 people take part.
Before each competition there is voting over five days in a row. Each day you vote on a set of themes, one of which may become the theme you will have to create a game for. My last competition had the theme "unconventional weapon". After the voting ends, the competition starts and you have to think about a game with (in my case) an unconventional weapon and start coding a game you like.
This is not about being the best; you should start looking at other people's projects after the competition has ended. You can learn a lot from other people and the ways they solve their problems, and I am sure you would improve yourself every time you take part in such a contest.
It's going to be hard to design a coding competition suitable for a wide array of languages, since languages typically serve different purposes. I'd suggest that what you're looking for doesn't exist.

Estimating Flash project hours

I'm trying to estimate the hours required to build a group of 5 simple children's games in Flash. They will include such things as having kids drag and drop healthy food items into a basket; choosing the healthy and unhealthy food items by marking them in some way; etc.
I have no experience building games in Flash, but I have programmed in Flex and Actionscript. How many hours do you estimate for this project?
While your ActionScript background will help, I find Flash to be a VERY different experience from Flex and that proficiency in one environment does not translate well.
Is there a compelling reason not to use Flex? I think you would likely be much more efficient.
That aside, the mechanics of a simple drag and drop game could be put together fairly quickly. There are some good examples of basic drag and drop around. It can be a little tricky to get the mouse coordinates right if it is your first time.
Beyond that, there are other hidden costs you need to remember - connecting infrastructure, for example. Are the games connected in some way? Is there a running score or persistence that might imply authentication? Is there a story?
Also, if your forte is programming, don't underestimate the challenges of obtaining or creating art and sound assets.
Before you can estimate the time you'll need to break down what the games do. In other words, you'll need to write up very clear and definite requirements. You may even need to write up specifications. Once you've analyzed what the software should do, the estimate will also take a while - one part for example is figuring out whether there's already software that does what you want.
In my opinion, the best way you can possibly estimate a programming project, especially one in a technology you don't understand, would be to apply the Use Case Points methodology. Basically you break the project up into use cases (what the users are trying to do) and actors (the user types and the system itself) and then list a few team and environment factors (how big an issue is code re-use, how familiar you are with the language, etc.). Studies have shown that it's more accurate for inexperienced developers than estimating based on features alone.
A Google search for "use case points estimate" reveals many useful links. This explanation of the methodology seems to do a good job explaining how it works, though I've not read the entire thing. This worksheet will help when you're ready to start listing points.
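To give a flavour of the arithmetic, here is a rough sketch with invented counts for a project like this; the weights and the roughly 20 hours-per-point figure are the ones commonly cited in the Use Case Points literature:

# Hypothetical counts for the 5 mini-games project (invented for illustration).
actors = {"simple": 1, "average": 0, "complex": 1}     # e.g. the child player, a results service
use_cases = {"simple": 3, "average": 2, "complex": 0}  # drag-and-drop game, marking game, ...

ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}

uaw = sum(ACTOR_WEIGHTS[k] * n for k, n in actors.items())        # unadjusted actor weight
uucw = sum(USE_CASE_WEIGHTS[k] * n for k, n in use_cases.items()) # unadjusted use case weight
uucp = uaw + uucw                                                 # unadjusted use case points

# Technical and environmental factors are each rated 0-5 and weighted per the method;
# the two sums below are made-up ratings for this example project.
tcf = 0.6 + 0.01 * 25   # technical complexity factor
ecf = 1.4 - 0.03 * 14   # environmental complexity factor (low familiarity hurts here)

ucp = uucp * tcf * ecf
hours = ucp * 20        # Karner's commonly cited ~20 person-hours per point
print(f"UUCP={uucp}, UCP={ucp:.1f}, estimate ~ {hours:.0f} hours")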

Where is the dividing line between complete lack of planning and analysis paralysis?

In my very short time working in the programming field, I've seen two extremes:
Projects where little to no planning was done, which thus became maintenance nightmares.
Projects that are perpetually in the planning stages and don't move from there.
It seems like the latter oftentimes happen as a reaction to the former. Where is the happy medium? And more importantly, if a project is moving in one of these directions, what is the best way to move it towards said happy medium?
In my own experience, I have found that 'decisions' are my bottleneck.
If this is the case, then :
List all your design options
Pick an option(s) (pick a few if you can't decide on one)
List the risks of the best option(s)
For each risk, brainstorm a solution, then design a conclusive proof of concept and write it.
If your proof of concepts proves it will NOT work, then toss that option, and pick another one.
A 'Proof Of Concept' is a minimal app to prove something. (mine are usually 1-6hrs)
If you have a situation where 2 or more options are equal, give yourself a time limit (like 5 minutes, not 2 months) and make a decision ... any decision, and don't look back.
And trust yourself to be able to deal with any problems you will hit which you did not take into account at design time.
Initial planning should be roughly O(log n) where n is expected total development time.
If you have to push in a week, sketch something on a napkin.
If you have a month, the first day is for initial design.
If you have a year, spend a week.
This does assume that you revisit planning iteratively, and don't just go all commando-style on your codebase, without adult supervision :-).
Analysis paralysis can have many symptoms. One that I have noticed is that the same questions are asked at each meeting and no resolution is reached. If you can point this out to people, that might help them admit that the planning process is stagnating.
If you can, at the start of the project, state that you want to cover a certain percentage of the requirements in the planning stage, say 80-90%. This way there is a clear goal and you aren't trying for perfection. You can revisit the planning and analysis later; just don't let it hold things up.
I think it depends on 2 factors:
The length of the project
Is it a 1 week project?
Is it a 1 year project?
Or somewhere in between?
The risk of the changes being introduced in the project
Are they architectural in nature, potentially affecting a lot of the original code?
Or are you simply adding a new feature?
Obviously, it's a combination of the above 2 factors. It simply doesn't make sense to spend 1 month designing a feature which will take 2 days to implement and is of little risk to the architecture. I'm picturing a matrix here of length/risk/design time tradeoffs.
There was some interesting advice in Code Complete 2, which I am currently in the process of reading. I can't remember the exact wording, so I am paraphrasing here, but it said something along the lines of:
The 2 biggest mistakes you can make in a design are:
1. Attempting to design EVERYTHING (you will fail)
2. Designing NOTHING before implementation
Finding the happy medium between these 2 is the key to successful design and planning.
Great questions - with no absolute answer - this is what makes experience meaningful. Experience includes:
how much detail can you actually get from any user/sponsor and from this specific group
how much detail does your current team need (technical/business specific skill levels)
how open are your sponsors/users in helping during the development (how agile of a team do you have - does it include the sponsor/user?)
how good you are at identifying gaps
A big factor is the system being developed - the more life-critical it is, the more detail you will need (a heart monitor compared to a web page).
The more complex, cost/time-constrained, or life-critical the system, the more up-front work is needed; the less so, the more you can fill in the details as you go (prototype to production).

How to measure usability to get hard data?

There are a few posts on usability but none of them was useful to me.
I need a quantitative measure of usability of some part of an application.
I need to estimate it in hard numbers to be able to compare it with future versions (e.g. for reporting purposes). The simplest way is to count clicks and keystrokes, but this seems too simple (for example, is the cost of filling a text field simply the sum of typing all the letters? I guess it is more complicated).
I need some mathematical model for that so I can estimate the numbers.
Does anyone know anything about this?
P.S. I don't need links to resources about designing user interfaces. I already have them. What I need is a mathematical apparatus to measure the usability of an existing application's interface in hard numbers.
Thanks in advance.
http://www.techsmith.com/morae.asp
This is what Microsoft used in part when they spent millions redesigning Office 2007 with the ribbon toolbar.
Here is how Office 2007 was analyzed:
http://cs.winona.edu/CSConference/2007proceedings/caty.pdf
Be sure to check out the references at the end of the PDF too, there's a ton of good stuff there. Look up how Microsoft did Office 2007 (regardless of how you feel about it), they spent a ton of money on this stuff.
The main concepts to look at here are effectiveness and efficiency (and, in some cases, efficacy). The basic points to remember are outlined on this webpage.
What you really want to look at doing is 'inspection' methods of measuring usability. These are typically more expensive to set up (both in terms of time, and finance), but can yield significant results if done properly. These methods include things like heuristic evaluation, which is simply comparing the system interface, and the usage of the system interface, with your usability heuristics (though, from what you've said above, this probably isn't what you're after).
More suited to your use, however, will be 'testing' methods, whereby you observe users performing tasks on your system. This is partially related to the point of effectiveness and efficiency, but can include various things, such as the "Think Aloud" concept (which works really well in certain circumstances, depending on the software being tested).
Jakob Nielsen has a decent (short) article on his website. There's another one, but it's more related to how to test in order to be representative, rather than how to perform the testing itself.
Consider measuring the time to perform critical tasks (using a new user and an experienced user) and the number of data entry errors for performing those tasks.
First you want to define goals: for example, increasing the percentage of users who can complete a certain set of tasks, and reducing the time they need for it.
Then get two cameras and a few users (5-10), give them a list of tasks to complete, and ask them to think out loud. Half of the users should use the "old" system, the rest should use the new one.
Review the tapes, measure the time it took, measure success rates, and discuss interpretations endlessly.
Alternatively, you can develop a system for bucket-testing -- it works the same way, though it makes it far more difficult to find out something new. On the other hand, it's much cheaper, so you can do many more iterations. Of course that's limited to sites you can open to public testing.
That obviously implies you're trying to get comparative data between two designs. I can't think of a way of expressing usability as a value.
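If you do collect that kind of comparative data (especially from bucket testing, where samples are larger), a small sketch like this - invented numbers, and a hand-rolled two-proportion z-test so it needs nothing beyond the standard library - shows how to check whether a difference in completion rates is more than noise:

import math

# Invented results: completions out of attempts for each design.
old_ok, old_n = 31, 50
new_ok, new_n = 42, 50

p_old, p_new = old_ok / old_n, new_ok / new_n
p_pool = (old_ok + new_ok) / (old_n + new_n)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / old_n + 1 / new_n))
z = (p_new - p_old) / se

print(f"old: {p_old:.0%} completed, new: {p_new:.0%} completed, z = {z:.2f}")
# |z| > 1.96 corresponds to p < 0.05 (two-sided) - i.e. the improvement is
# unlikely to be an accident of which users you happened to recruit.
print("significant at 5%" if abs(z) > 1.96 else "not significant - get more users or a bigger effect")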
You might want to look into the GOMS model (Goals, Operators, Methods, and Selection rules). It is a very difficult research tool to use in my opinion, but it does provide a "mathematical" basis to measure performance in a strictly controlled environment. It is best used with "expert" users. See this very interesting case study of Project Ernestine for New England Telephone operators.
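For a quick number, GOMS is usually applied through its simplified cousin, the Keystroke-Level Model. A minimal sketch using the commonly published operator times (the login-form task breakdown below is made up for illustration):

# Commonly published Keystroke-Level Model operator times, in seconds.
KLM = {
    "K": 0.28,  # press a key (average typist)
    "P": 1.10,  # point at a target with the mouse
    "B": 0.10,  # press or release a mouse button
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_time(ops):
    """Estimate expert, error-free execution time for a sequence of operators."""
    return sum(KLM[op] for op in ops)

# Hypothetical task: click the username field, type 8 characters,
# tab to the password field, type 8 characters, then click "Log in".
login = ["M", "P", "B", "H"] + ["K"] * 8 + ["K"] + ["K"] * 8 + ["M", "H", "P", "B"]
print(f"estimated time: {klm_time(login):.1f} s")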
Measuring usability quantitatively is an extremely hard problem. I tackled this as a part of my doctoral work. The short answer is, yes, you can measure it; no, you can't use the results in a vacuum. You have to understand why something took longer or shorter; simply comparing numbers is worse than useless, because it's misleading.
For comparing alternate interfaces it works okay. In a longitudinal study, where users are bringing their past expertise with version 1 into their use of version 2, it's not going to be as useful. You will also need to take into account time to learn the interface, including time to re-understand the interface if the user's been away from it. Finally, if the task is of variable difficulty (and this is the usual case in the real world) then your numbers will be all over the map unless you have some way to factor out this difficulty.
GOMS (mentioned above) is a good method to use during the design phase to get an intuition about whether interface A is better than B at doing a specific task. However, it only addresses error-free performance by expert users, and only measures low-level task execution time. If the user figures out a more efficient way to do their work that you haven't thought of, you won't have a GOMS estimate for it and will have to draft one up.
Some specific measures that you could look into:
Measuring clock time for a standard task is good if you want to know what takes a long time. However, lab tests generally involve test subjects working much harder and concentrating much more than they do in everyday work, so comparing results from the lab to real users is going to be misleading.
Error rate: how often the user makes mistakes or backtracks. Especially if you notice the same sort of error occurring over and over again.
Appearance of workarounds: if your users are working around a feature, or taking a bunch of steps that you think are dumb, it may be a sign that your interface doesn't give them the tools to figure out how to solve their problems.
Don't underestimate simply asking users how well they thought things went. Subjective usability is finicky but can be revealing.

Seating plan software recommendations (does such a beast even exist?)

I'm getting married soon and am busy with the seating plan, and am running into the usual issues of who sits where: X and Y must sit together, but A and B cannot stand each other etc.
The numbers I'm dealing with aren't huge (so the manual option will work just fine), but being of the geeky persuasion, I was wondering if there was any software available to do this for me?
Failing an exact match, what should I look for (the problem space, books, reference code) to tweak for my purposes?
I am the developer of PerfectTablePlan. I post here as well as on Joel's Business of Software forum. ;0)
Combinatorial problems, such as seat assignment, are quite nasty algorithmically. NP-hard in fact. The number of ways to seat 60 guests in 60 seats is 60! (60 factorial) and that is more than the number of atoms in the known universe.
PerfectTablePlan allows you to specify that A must sit next to B, but nowhere near C. It uses a genetic algorithm to automatically assign the seats. This works pretty well in practice - it will usually find a decent solution for 100 guests in a few seconds. You might need to go and make a coffee for 1000+ guests. In practice some drag-and-drop fine-tuning is also usually required to cope with the vagaries of local customs and family politics (Uncle Bob is a bit deaf, so we had better put him nearer the top table).
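This is not PerfectTablePlan's actual code, but a minimal sketch of the idea in Python - made-up guests and constraints, and mutation-only for brevity where a real genetic algorithm would also use crossover:

import random

GUESTS = ["Ann", "Bob", "Cat", "Dan", "Eve", "Fay", "Gus", "Hal"]
TABLES, SEATS_PER_TABLE = 2, 4
TOGETHER = [("Ann", "Bob")]  # must share a table
APART = [("Cat", "Dan")]     # must not share a table

def table_of(plan, guest):
    return plan.index(guest) // SEATS_PER_TABLE

def fitness(plan):
    """Higher is better: reward satisfied 'together' pairs, punish 'apart' violations."""
    score = sum(1 for a, b in TOGETHER if table_of(plan, a) == table_of(plan, b))
    score -= sum(1 for a, b in APART if table_of(plan, a) == table_of(plan, b))
    return score

def mutate(plan):
    """Swap two random guests."""
    child = plan[:]
    i, j = random.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def evolve(pop_size=50, generations=200):
    population = [random.sample(GUESTS, len(GUESTS)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # simple truncation selection
        population = survivors + [mutate(random.choice(survivors)) for _ in survivors]
    return max(population, key=fitness)

best = evolve()
for t in range(TABLES):
    print(f"table {t + 1}:", best[t * SEATS_PER_TABLE:(t + 1) * SEATS_PER_TABLE])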
You can find out a bit more about the genetic algorithm here.
PS: The automatic seat assignment is only a small part of creating a good seating plan. See the PerfectTablePlan tour and the tips page for more details.
http://www.perfecttableplan.com/
I believe this is from a guy that usually posts at Joel On Software.
Never tried it though.
Hope it helps.
Try modeling this using GLPK. Integer linear programming is amenable to introducing constraints into graph-based problems with multiple possible outcomes.
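As a sketch of that formulation (using PuLP as a front end, which can drive GLPK or fall back to its bundled CBC solver; the guests, tables, and constraints are invented, and it only checks feasibility rather than optimizing anything):

import pulp

guests = ["Ann", "Bob", "Cat", "Dan", "Eve", "Fay"]
tables = ["T1", "T2"]
capacity = 3
together = [("Ann", "Bob")]  # must share a table
apart = [("Cat", "Dan")]     # must not share a table

prob = pulp.LpProblem("seating", pulp.LpMinimize)
# x[g][t] = 1 if guest g sits at table t
x = pulp.LpVariable.dicts("x", (guests, tables), cat="Binary")

prob += pulp.lpSum([])  # pure feasibility problem - nothing to optimize

for g in guests:                      # everyone sits at exactly one table
    prob += pulp.lpSum(x[g][t] for t in tables) == 1
for t in tables:                      # respect table capacity
    prob += pulp.lpSum(x[g][t] for g in guests) <= capacity
for a, b in together:                 # 'together' pairs share every table indicator
    for t in tables:
        prob += x[a][t] == x[b][t]
for a, b in apart:                    # 'apart' pairs never share a table
    for t in tables:
        prob += x[a][t] + x[b][t] <= 1

prob.solve()                          # pass pulp.GLPK_CMD() here if GLPK is installed
print("status:", pulp.LpStatus[prob.status])
for g in guests:
    for t in tables:
        if x[g][t].value() == 1:
            print(g, "->", t)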
My personal favorite is to not assign seating: allow folks to sit wherever they want.
That might lead to [un]intentional cliquishness, but it means you're not having to worry about it.
I expect this isn't a great answer, but you could research flocking behavior
If you take away the random jitters at each step, the flock eventually settles where each member has found its optimum position in relation to the rest of the group.
This sounds like a constraint satisfaction problem. You should probably check out logic programming systems that are also equipped with constraint solvers. They're usually like Prolog, except that they are actually declarative for problems that are solvable by their solvers.
Hopefully there is one that has an easy interface from your favourite language, to get the data in and out.
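If pulling in a full logic-programming system feels like overkill, the same constraint-satisfaction idea can be sketched as plain backtracking search in Python (guests, table sizes, and constraints below are invented):

GUESTS = ["Ann", "Bob", "Cat", "Dan", "Eve", "Fay"]
TABLE_SIZES = {"T1": 3, "T2": 3}
TOGETHER = [("Ann", "Bob")]  # must be at the same table
APART = [("Cat", "Dan")]     # must be at different tables

def consistent(assignment):
    """Check capacity and pairwise constraints for the guests assigned so far."""
    counts = {}
    for table in assignment.values():
        counts[table] = counts.get(table, 0) + 1
        if counts[table] > TABLE_SIZES[table]:
            return False
    for a, b in TOGETHER:
        if a in assignment and b in assignment and assignment[a] != assignment[b]:
            return False
    for a, b in APART:
        if a in assignment and b in assignment and assignment[a] == assignment[b]:
            return False
    return True

def solve(assignment=None):
    """Assign guests one at a time, backtracking whenever a constraint breaks."""
    assignment = assignment or {}
    if len(assignment) == len(GUESTS):
        return assignment
    guest = GUESTS[len(assignment)]
    for table in TABLE_SIZES:
        assignment[guest] = table
        if consistent(assignment):
            result = solve(assignment)
            if result:
                return result
        del assignment[guest]
    return None

print(solve())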
I used a program a while ago that would fit this perfectly... it was a Java app; you could define rules, and it would create test cases that satisfied the rules. The file extension was .als
fact GateRules {
    all g: Gate | one g.loc  // Gates have 1 Location
}
I'll keep wracking my brain for the program name.
EDIT: It was Alloy
Now that I think about it, it may not be ideal - the notion of "seats in a fixed configuration" would be a little difficult to model. I used it differently: defining rules (an airport gate is in one location, only one plane is on a runway), and testing pre- and post-conditions for functions (after I land a plane, can I ever have more than one plane on a runway?).