Any good threads related job-interview question? - language-agnostic

When interviewing graduates I usually ask them questions about data structures, algorithms and complexity theory. I would really like to ask a question that will enable them to show their familiarity with multi-threaded concepts, without dwelling into language specific issues.
Any good questions? The only question I could think of is how to write a Singleton that supports multi-threaded access.

I find the classic "write me a consumer-producer queue" question to be quite good. You can talk about synchronization in a handwavy way beforehand for five minutes or so (e.g. start with "What does Object.wait() do? What other methods on Object is it related to? Can you give me an example of when you might use these? What other concurrency techniques might you use in practice [because really, it's quite rare that actually using the wait/notify primitives is the best approach]?"). Make sure the candidate addresses (or at least makes clear he is aware of) both atomicity ("missed updates") and volatility (visibility of the new value on other threads)
Then after you've had a chat about the theory of these, get them to spend a few minutes actually writing the code for a primitive producer-consumer queue. This should be straightforward to anyone who actually understands what they were talking about above, yet it will weed out those who can "talk the talk" but don't actually understand it in practice (arguably the most dangerous group).
What I like about these mini-coding exercises, is that they're often easy to extend. For instance, if the candidate completes the task easily, you can ask how they would extend it for situation XXX - invent requirements that you know will push the limits of the noddy solution you asked for. This not only lets you tailor the depth of questions you're asking but gives some insight into how well the candidate handles clarification of requirements, and modifications of existing design (which is pretty important in this industry).

Here you can find some topics to discuss:
threads implementation ( kernel vs user space)
thread local storage
synchronization primitives
deadlocks, livelocks

Differences between mutex and
semaphore.
Use of condition variables.
When not to use threads. (eg. IO multiplexing)

Talk with them about a popular, but not well-known topic, where thread handling is essential.
I recommend you, build a web server with them, of course, only on paper or just in words. The result should look something like this: there is a main thread, it's listening on a socket. When something arrives, it passes the socket into the pool, then this thread returns back to socket listening. The pool has fixed number of slots. The request processing threads are dedicated to get job from the pool. Find out, what's better, if the threads are checking the pool concurrently, or the listner main thread selects a free slot/thread for the new incoming request. Try to write a small pseudocode, or a graph for both side of the pool handling.
Let's introduce a small application: page counter, which tells that how many page request has been made since server startup. Don't tell them that the counter must be protected against concurrent modification, let them to find it out how to do this with mutexes or synchronization or whatsoever. Maybe you could skip the web server part, the page counter app is easier to specify.
Another example is a chat, with 2+ clients and a server, find out, how to solve the problem, that all the messages should arrive in the same order for all clients. Or reflex game: the server waits for 1..5 secs random, then says "peek-a-boo", and the player wins who presses space key first. Specify it with 2 player, then try to expand it to N players.
Also, be aware of NPPs. NPP stands: "non-programming programmer". There are dudes, who can talk about programming issues, they know all the 3/4-letter abbrevations (there're lot in the Java world, EJB, JSP, XSLT, and my favourite: POJO, which means Pure Old Java Objects, lol), they understand and modify codes, or make similar programs from a base, but they fail even with small problems, it it has to do it theirselves, e.g. finding the nearest element to a base in an array. Sometimes it takes months, until it turns out. They performs well at interviews, because they prepare for it. Maybe they don't even known, that they're NPPs, this is a known effect: http://en.wikipedia.org/wiki/Dunning-Kruger_effect
It's hard to recognize the opposite dudes, who have not heard about trendy libraries or patterns, but they can learn it even at the job interview. (Personal remark: my last interview was in 1999, and it seems that I will not do interview anymore. I have never heard of dynamic web pages before, but I've figured out the term "session" during the interview, the question was that how to build a simple hanging man web app. I was hired.)

Related

AS3 - Does the amount of code in an object matter for multiple instances?

What's better?
1000 objects with 500 lines of code each
vs
1000 objects with 30 lines of code each that delegate to one manager with 500 lines of code.
Context:
I have signals in my game. Dozens and dozens of them. I like that they have better performance than the native Flash Event.
I figured that as I was going to have so many, they should be lightweight, so each signal only stores a few variables and three methods.
Variables: head, tail, isStepping, hasColdNodes, signalMgr
Methods: addListener, removeListener, dispatch
But all these signals delegate the heavywork to the signalMgr, as in:
signalMgr.addListener(this, listener, removeOnFirstCallback);
That manager handles all the doublylinked list stuff, and has much more code than a signal.
So, is it correct to think this way?
I figure that if I had all the management code in the signal, that would be repeated in memory every time I instance one.
In the context of your question this is pretty much irrelevant and both cases should not produce much difference.
From what you say it seems that you assume many things but did not actually check any. For instance when you say: "I like that they have better performance than the native Flash Event." I can only assume that you did read that somewhere but never try to verify it by yourself. There's only a few cases where using the signal system can make a tiny bit of difference, in most cases they don't bring much, in some cases they make things worse. In the context of Flash development signals do not bring anything more than simple convenience. Flash is an event driven system that cannot be turned off so using signals with it means using event + signals, not using signals alone. In the case of custom events using delegation is much more efficient and easier to use and doesn't require any object creation.
But the real answer to the question is even more simple: There's no point optimizing something that you don't know needs optimization. Even worse different OS will produce different optimization needs so anyone trying to answer a general optimization question can only fail or pretend to know.
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. (c) DonaldKnuth
The ultimate goal of any programmer is to write clean code. So you should write clean code with a only a few thoughts of optimizations.

how to make a private game server?

i have always wanted to make a private server but i don't know how i would do this.
i know how a private server works, the game sends data packets to the server. the server will take the data and process it and send data to the other games connected.
my questions are,
how do you edit the game so it will go to your server/change game data.
how do you find what packets do what.
the game will be something like WOW, i have not desided yet.
If you are hoping to embark on creating your own MMORPG then you have a huge task ahead of you, and unfortunately to put it nicely you are probably being too ambitious especially if you are asking these sorts of questions.
You should probably read up on client server architecture.
Also, in answer to your questions about the structure of the data being sent and how it is interpreted, well, that's 100% up to the people that design the system. You will want to simulate the entire game on the server(s) and don't trust the clients at all.
For something as complex as a MMORPG it is really important to create a solid design for the system before anything else, this is very important.
Just to be clear your intent is to create an emulated MMO server to the effect of WOW?
That's not really a trivial task and carries with it its own ethical implications.
Just to get started will require a ton of research, inspection, decoding, an extreme attention to detail.
If you are serious about it, then I would suggest looking up networking tools that can help you inspect traffic across the network and creating a scientific process for operation inspection.
Again, it should be noted this is by no means a trivial task.
This will be fairly difficult as you do not have the communicaton protocol specification for the game's client/server communication.
If you want to start this, then create a server that is simply a pass through. That is, all client requests are forwarded to the particular server. Once you have generated a large enough sample size of packets to study, then you can begin to dissect the meaning of each byte (possibly). Of course, if the packets are encrypted in any way (even a simple XOR encryption) then you will have an even harder time trying to figure out what each byte means. You should capture a sample set using two clients running sniffers so you can see what happens when one client does something and it needs to be sent to all clients.
But if I were you, I would just abandon the idea and work on something else. My two cents..
If you'd like an inside look at how games do networking, there's always Ryzom, which went open-source earlier this year. If you're creating your own MMO you can begin right there, and if you're looking to reverse-engineer one you can practice with your own client and server.

The Implications of Modern Day Software Development Abstractions [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I am currently doing a dissertation about the implications or dangers that today's software development practices or teachings may have on the long term effects of programming.
Just to make it clear: I am not attacking the use abstractions in programming. Every programmer knows that abstractions are the bases for modularity.
What I want to investigate with this dissertation are the positive and negative effects abstractions can have in software development. As regards the positive, I am sure that I can find many sources that can confirm this. But what about the negative effects of abstractions? Do you have any stories to share that talk about when certain abstractions failed on you?
The main concern is that many programmers today are programming against abstractions without having the faintest idea of what the abstraction is doing under-the-covers. This may very well lead to bugs and bad design. So, in you're opinion, how important is it that programmers actually know what is going below the abstractions?
Taking a simple example from Joel's Back to Basics, C's strcat:
void strcat( char* dest, char* src )
{
while (*dest) dest++;
while (*dest++ = *src++);
}
The above function hosts the issue that if you are doing string concatenation, the function is always starting from the beginning of the dest pointer to find the null terminator character, whereas if you write the function as follows, you will return a pointer to where the concatenated string is, which in turn allows you to pass this new pointer to the concatenation function as the *dest parameter:
char* mystrcat( char* dest, char* src )
{
while (*dest) dest++;
while (*dest++ = *src++);
return --dest;
}
Now this is obviously a very simple as regards abstractions, but it is the same concept I shall be investigating.
Finally, what do you think about the issue that schools are preferring to teach Java instead of C and Lisp ?
Can you please give your opinions and your says as regards this subject?
Thank you for your time and I appreciate every comment.
First of all, abstractions are inevitable because they help us to deal with the mind-blowing complexity of things.
Abstractions are also inevitable because it is more and more required of an individual to undertake more tasks or even complete projects. To address the problem, one uses libraries which wrap lower-level concepts and expose more complex behavior.
Naturally, a developer has less and less time to know the intrinsics of the things. The latest concern I heard about on SO pages is starting to learn JavaScript with jQuery library, ignoring the raw JavaScript at all.
The issue is about the balance between:
Know the little tiniest details of some technology and be a master of it, but at the same time being unable to work with anything else.
Superficial knowledge of a wide variety of technologies and tools which however proves sufficient for common everyday tasks which allows an individual to perform in multiple areas possibly covering all sides of some (moderately big) project.
Take your pick.
Some work requires the one, another position requires the other.
So, in you're opinion, how important is it that programmers actually know what is going below the abstractions?
It would be nice if people knew what is happening behind the scenes. This knowledge comes with time and practice, up to a certain degree. Depends on what kind of tasks you have. You certainly shouldn't blame people for not knowing everything. If you wish a person to be able to perform in a variety of fields, it is inevitable he won't have time to cover each up to the last bit.
What is essential, is the knowledge of the basic building blocks. Data structures, algorithms, complexity. That should provide a basis for everything else.
Knowing tiniest details of some particular technology is good, but not essential. Anyway, you can't learn them all. They're too many and they keep coming.
Finally, what do you think about the issue that schools are preferring to teach Java instead of C and Lisp ?
Schools shouldn't be teaching programming languages at all. They're to teach basics of theoretical and practical CS, social skills, communication, team work. To cover a vast variety of topics and problems to provide a wide angle view for their graduates. This will help them to find their way. Whatever they need to know in details, they'll do it on their own.
An example where abstraction has failed:
In this case, a piece of software was needed to communicate to many different third party data processors. The communication was done through various messaging protocols; the transport method/protocol is not important in this case. Just assume everyone communicated through messaging.
The idea was to abstract the features of each of these third parties into a single, unified message format. It seemed relatively straightforward because each of the third parties performed a similar service. The problem was that some third parties used different terms to explain similar features. It was also found that some third parties had additional features that other third parties did not have.
The designers of the abstraction did not see through the difference of third party terms nor did they think it was reasonable to limit the scope of the unified features to only support the common features of the third parties. Instead, a single, monolithic message schema was developed to support any and all features of the third parties considered at the time. In what was probably considered a future-proofing move, they added a means of also passing an infinite number of name/value pairs along with the monolithic message in case there were future data elements that the monolithic message could not handle.
Early on, it became clear that changing the monolithic message was going to be difficult due to so many people using it in mission critical systems. The use of the name/value pairs increased. Each name that could be used was documented inside a large spreadsheet, and developers were required to consult the spreadsheet to avoid duplication of name value function. The list got so large, however, that it was found that there were frequently collisions in purposes of name values.
The majority of the monolithic message's fields now have no purpose and are kept mainly for backwards compatibility. There are name values that can be used to replace fields in the monolithic message. The majority of the interfacing is now done through the name/value pairs. In cases where the client is intending to communicate with more than one third party, each client needs to reconcile the name values available for each third party. It would be almost simpler to interface directly to the third party themselves.
I believe this illustrates that, from a consumer of the monolithic message perspective, that it is important that developers of the consuming code not know what is happening under the covers. If the designers had considered that the consumers of the monolithic message should not have to understand the abstraction in great detail, the monolithic message and it's associated name/value pairs might never have happened. Documenting the abstraction with assertions regarding input and expected output would make life so much simpler.
As for colleges not teaching C and Lisp....they are cheating the students. You get a better understanding of what is going on with the machine and OS with C. You get a bit of a different perspective on processing data and approaching problems with Lisp. I have used some of the ideas I learned using Lisp in programs written in C, C++, .Net, and Java. Learning Java after knowing even just C is not very difficult. The OO part is really not programming language specific, so perhaps using Java for that is acceptable.
An understanding of fundamentals of algorithms (e.g. time complexity) and some knowledge about the metal is essential to designing/writing smells-good code.
I would suggest, though, that just as important is education in modern abstractions and profiling. I feel that modern abstractions make me so much more productive than I would be without them that they are at least as important as good fundamentals, if not more so.
An important element that lacked in my education was the use of profilers. When used routinely and correctly, profilers can help mitigate problems with poor fundamentals.
Since you quote Joel Spolsky, I take it your aware of his "Law of Leaky Abstractions"? I'll mention it for future readers. http://www.joelonsoftware.com/articles/LeakyAbstractions.html
Green & Blackwell's Ironies of Abstractions talks a bit about the effort of learning the abstraction. http://homepage.ntlworld.com/greenery/workStuff/Papers/index.html
The term "astronaut architecture" is a reaction to over-abstraction.
I know I certainly curse abstraction when I haven't touched Java or C# in a while and i want to write to a file, but have to instance a Stream...Writer...Adaptor....Handler....
Also, Patterns, as in Gang Of Four. Seemed great when I first read about them in the mid-90's, but can never remember factory, facade, interface, helper, worker, flyweight....

How to make sure that your code is secure?

I am a programmer. I have about 5 years experience of programming in different kind of languages. I was concerning about my code speed, about optimizing the memory that uses my code, and about good coding style and so on. But have never thought how secure my code is. So I have disassembled my code to see what can do a hacker. Would it be easy to crack my code?
And I saw that it is! It is very easy, because I was storing
serial number as a string
encryption-decryption codes as well
So if someone has the minimal knowledge of assembler he/she can just simple dissembler and after 10-20 minutes of debugging my code is cracked!!! Even it could be done by opening the exe with notepad I guess! :-)
So what I am asking are the following:
Where I should store that kind of secure information’s?
What are the common strategies of delivering a secure code?
First thing you must realize is that you'll never prevent a determined reverser from cracking any protection schemes because anything that the code can do, the reverser will eventually find out how to replicate it. The only way you can achieve any sort of reliable protection is to have the shipped program be nothing more than a dumb client and have the brunt of the software on some server the reverser has no access to.
With that out of the way, you can certainly make it harder for a would be reverser to break your protections. Obfuscation is the sort of first step in achieving this. I have no experience using obfuscators but I'm sure you can find some suggestions for some on SO. Also if you're using a lower level language like C/C++, simply compiling the code with full optimization and stripping all debugging symbols gets you a decent amount of obfuscation.
I read this article a few years ago, but I still think it's techniques hold up today. It's one of the developers of a video game called Spyro talking about the set of techniques they used to prevent piracy. They claim it wasn't until 3 months after the release that a cracked version became available, which is fairly impressive.
If you are concerned about piracy, then there are many avenues you can take. Making the code security tighter (obfuscation, license codes, binding the software to a particular PC, hardware/dongle protection, etc) is one, but it's worth bearing in mind that every piece of software can be cracked if someone sufficiently talented can be bothered.
Another approach is to consider the pricing model for your software. If you charge $1000 a copy, then there is a big incentive for someone to have a go at cracking it. If you only charge $5 then why should anyone bother to crack it?
So what is needed is a balance. Even the most basic protection will stop ordinary people making casual copies. Beyond that, simple techniques (obfuscation and license codes) and a sensible pricing strategy will hold most would-be crackers at bay by making it not worth the bother of cracking. After that, you start getting into ever more sophisticated techniques (dongles/CDs needing to be present to run the software, only being able to run the software after logging on to an online licensing system) that take a lot of effort/cost to implement and significantly increase the risk of annoying genuine customers (remember how annoyed everyone got when they bought half life but it wouldn't let them play the game?) - unless you have a popular mainstream product (i.e. a huge revenue stream to protect), there probably isn't much point going to that much effort.
Make it web app.
It will generally not be well-protected unless there's an external service doing the checking that you are in control of - and that service can still be spoofed by those who really wants to "crack" it. Instead, trust the customer and provide only minimal copyright protection. I'm sure there was an article or podcast about this by Joel Spolsky somewhere... here's another related SO question.
I have no idea if it will help but Windows provides (since 2000) a mechanism to retrieve and store encrypted information and you can also salt this storage on a per-application basis if needed: Data Protection API (DPAPI)
This is on a machine or a user level but storing serials and perhaps some keys using it might be better than having them hidden in the application?
What sort of secure are you talking about?
Secure from the perspective that you are guarding your users data well? If so, study some real cryptography and utilize Existing libraries to encrypt your data. The win32 API is pretty good for this.
But if you're talking about stopping a cracker from stealing your application? There are many methods, but just give up. They slow crackers down, they don't stop them.
Look at How to hide a string in binary code? question
First you have to define what your code should be secure against, being secure as such is meaningless.
You seem to be worried about reverse engineering and users generating license codes without paying, though you don't say so. To make this harder you can obfuscate your code and key information in various ways. There area also techniques to make the use of debuggers harder, to prevent the reverse engineer from stepping through the code and seeing the information in clear.
But this only makes reverse engineering somewhat harder, not impossible
Another common security threat is execution of unwanted code, for example via buffer overflows.
A simple technique for doing this is to xor over all your code and xor back when you need it... but this needs an innate knowledge of assembly... I'm not sure, but you could try this:
void (*encryptionFunctn)(void);
void hideEncryptnFunctn(void)
{
volatile char * i;
while(*i!=0xC0) // 0xC0 is the opcode for ret
{
*i++^=0x45; // or any other code
}
}
To prevent against hackers viewing your code, you should use an obfuscator. An obfuscator will use various techniques which make it extremely difficult to make sense of the obfuscated code. Some techniques used are string encryption, symbol renaming, control flow obfuscation, etc. Check out Crypto Obfuscator which additionally also has external method call hiding, Anti-Reflector, Anti-Debugging, etc
The goal is to erect as many obstacles as possible in the path of a would-be hacker.

how are serial generators / cracks developed?

I mean, I always was wondered about how the hell somebody can develop algorithms to break/cheat the constraints of legal use in many shareware programs out there.
Just for curiosity.
Apart from being illegal, it's a very complex task.
Speaking just at a teoretical level the common way is to disassemble the program to crack and try to find where the key or the serialcode is checked.
Easier said than done since any serious protection scheme will check values in multiple places and also will derive critical information from the serial key for later use so that when you think you guessed it, the program will crash.
To create a crack you have to identify all the points where a check is done and modify the assembly code appropriately (often inverting a conditional jump or storing costants into memory locations).
To create a keygen you have to understand the algorithm and write a program to re-do the exact same calculation (I remember an old version of MS Office whose serial had a very simple rule, the sum of the digit should have been a multiple of 7, so writing the keygen was rather trivial).
Both activities requires you to follow the execution of the application into a debugger and try to figure out what's happening. And you need to know the low level API of your Operating System.
Some heavily protected application have the code encrypted so that the file can't be disassembled. It is decrypted when loaded into memory but then they refuse to start if they detect that an in-memory debugger has started,
In essence it's something that requires a very deep knowledge, ingenuity and a lot of time! Oh, did I mention that is illegal in most countries?
If you want to know more, Google for the +ORC Cracking Tutorials they are very old and probably useless nowdays but will give you a good idea of what it means.
Anyway, a very good reason to know all this is if you want to write your own protection scheme.
The bad guys search for the key-check code using a disassembler. This is relative easy if you know how to do this.
Afterwards you translate the key-checking code to C or another language (this step is optional). Reversing the process of key-checking gives you a key-generator.
If you know assembler it takes roughly a weekend to learn how to do this. I've done it just some years ago (never released anything though. It was just research for my game-development job. To write a hard to crack key you have to understand how people approach cracking).
Nils's post deals with key generators. For cracks, usually you find a branch point and invert (or remove the condition) the logic. For example, you'll test to see if the software is registered, and the test may return zero if so, and then jump accordingly. You can change the "jump if equals zero (je)" to "jump if not-equals zero (jne)" by modifying a single byte. Or you can write no-operations over various portions of the code that do things that you don't want to do.
Compiled programs can be disassembled and with enough time, determined people can develop binary patches. A crack is simply a binary patch to get the program to behave differently.
First, most copy-protection schemes aren't terribly well advanced, which is why you don't see a lot of people rolling their own these days.
There are a few methods used to do this. You can step through the code in a debugger, which does generally require a decent knowledge of assembly. Using that you can get an idea of where in the program copy protection/keygen methods are called. With that, you can use a disassembler like IDA Pro to analyze the code more closely and try to understand what is going on, and how you can bypass it. I've cracked time-limited Betas before by inserting NOOP instructions over the date-check.
It really just comes down to a good understanding of software and a basic understanding of assembly. Hak5 did a two-part series on the first two episodes this season on kind of the basics of reverse engineering and cracking. It's really basic, but it's probably exactly what you're looking for.
A would-be cracker disassembles the program and looks for the "copy protection" bits, specifically for the algorithm that determines if a serial number is valid. From that code, you can often see what pattern of bits is required to unlock the functionality, and then write a generator to create numbers with those patterns.
Another alternative is to look for functions that return "true" if the serial number is valid and "false" if it's not, then develop a binary patch so that the function always returns "true".
Everything else is largely a variant on those two ideas. Copy protection is always breakable by definition - at some point you have to end up with executable code or the processor couldn't run it.
The serial number you can just extract the algorithm and start throwing "Guesses" at it and look for a positive response. Computers are powerful, usually only takes a little while before it starts spitting out hits.
As for hacking, I used to be able to step through programs at a high level and look for a point where it stopped working. Then you go back to the last "Call" that succeeded and step into it, then repeat. Back then, the copy protection was usually writing to the disk and seeing if a subsequent read succeeded (If so, the copy protection failed because they used to burn part of the floppy with a laser so it couldn't be written to).
Then it was just a matter of finding the right call and hardcoding the correct return value from that call.
I'm sure it's still similar, but they go through a lot of effort to hide the location of the call. Last one I tried I gave up because it kept loading code over the code I was single-stepping through, and I'm sure it's gotten lots more complicated since then.
I wonder why they don't just distribute personalized binaries, where the name of the owner is stored somewhere (encrypted and obfuscated) in the binary or better distributed over the whole binary.. AFAIK Apple is doing this with the Music files from the iTunes store, however there it's far too easy, to remove the name from the files.
I assume each crack is different, but I would guess in most cases somebody spends
a lot of time in the debugger tracing the application in question.
The serial generator takes that one step further by analyzing the algorithm that
checks the serial number for validity and reverse engineers it.