Should I change code to make it more testable? - language-agnostic

I often find myself changing my code to make it more testable, and I always wonder whether this is a good idea or not. Some of the things I find myself doing are:
Adding setters just so I can set an internal object to a mock.
Adding getters for internal maps/lists so I can check the internal state of the object has changed after performing some external action.
Wrapping concrete system classes and creating a new interface so I can mock them. For example, File classes can be hard to mock, so I'll create a new interface FileInterface and a WrappedFile class which implements it, and then use FileInterface instead of File.

Changing your code to make it more testable can be a good thing, but only if it makes your code itself better. Refactoring for testability can make your code better independent of the test suite's needs. Those are good changes.
Of your three examples only #3 is a really good one; often those new interfaces will make your code more flexible for regular use later. #1 is usually addressed for testing via dependency injection, which in my mind makes code needlessly more complicated but does at least make it easier to test. #2 sounds like a bad idea in general.
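As a rough illustration of example #3, here is a minimal Java sketch. FileInterface and WrappedFile are the names from the question; the method names, FakeFile and ConfigLoader are assumptions invented for this sketch, not anything the asker described.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Narrow interface holding only the file operations the calling code needs.
// The method names are assumptions made for this sketch.
interface FileInterface {
    boolean exists();
    String readText();
}

// Production implementation: delegates to the real java.io.File.
class WrappedFile implements FileInterface {
    private final File file;

    WrappedFile(File file) {
        this.file = file;
    }

    @Override
    public boolean exists() {
        return file.exists();
    }

    @Override
    public String readText() {
        try {
            return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Hand-written test double (hypothetical): never touches the disk.
class FakeFile implements FileInterface {
    private final String content; // null simulates a missing file

    FakeFile(String content) {
        this.content = content;
    }

    @Override
    public boolean exists() {
        return content != null;
    }

    @Override
    public String readText() {
        if (content == null) {
            throw new IllegalStateException("file does not exist");
        }
        return content;
    }
}

// Hypothetical class under test: it only knows about FileInterface.
class ConfigLoader {
    private final FileInterface source;

    ConfigLoader(FileInterface source) {
        this.source = source;
    }

    String loadOrDefault(String fallback) {
        return source.exists() ? source.readText().trim() : fallback;
    }
}
```

Production code would build the loader with a WrappedFile around a real File, while a test passes a FakeFile and never needs a file on disk.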

It is perfectly fine and even recommended to change your code to make it more testable. Here is a list of 10 things that make code hard to test.
I think your third is really ok, but I'm not too fond of the first and the second. If you just open your class internals with getters and setters, then you're giving up encapsulation completely. Depending on your language, there are ways to open visibility of some parameters to test. But what I actually do (which opens encapsulation a little less) is to make the fields I want to check protected (when dependency injection doesn't solve the problem).
Then, on the test project, I inherit the class, and create a "more powerful one", where I can check the internals, but I change nothing on the implementation, and use this class in the tests.
Finally, changing your code to have dependency injection and inversion of control is also highly recommended, as it makes your code easier to test AND more readable and maintainable.
Though changing is ok, the best thing to do is TDD. It produces testable code naturally, since the tests are written first.
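The protected-field approach described above might look roughly like this in Java. OrderQueue and InspectableOrderQueue are hypothetical names made up for the sketch. One caveat worth knowing: in Java, protected also grants access to other classes in the same package, so the field is slightly more open than in some other languages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical production class: the field is protected rather than private,
// and there is no public getter on the class itself.
class OrderQueue {
    protected final List<String> pending = new ArrayList<>();

    public void submit(String orderId) {
        // Empty or missing ids are silently ignored.
        if (orderId != null && !orderId.isEmpty()) {
            pending.add(orderId);
        }
    }
}

// Lives in the test project only. It adds a way to look at the internals
// but changes nothing about the behaviour under test.
class InspectableOrderQueue extends OrderQueue {
    List<String> pendingSnapshot() {
        // Return a copy so a test cannot modify the real list by accident.
        return new ArrayList<>(pending);
    }
}
```

A test then creates an InspectableOrderQueue, calls submit(), and asserts on pendingSnapshot().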

It's a trade-off.
Do you want a slim, streamlined API, or a bloated, more complicated, but more easily tested one?
Not the answer you wanted to hear, I know :)

Seems reasonable. Some things don't need checking though; I'd suggest that checking whether adding to a list worked is a little useless. But do whatever you feel comfortable with.

Ideally, every class you design should test itself, so you don't need to change the public interface. But you are talking about legacy code, so I think that changing code is reasonable only when the public impact isn't very noticeable. I would prefer to add a static inner class for testing instead of bloating the interface of the tested class.

Related

Actionscript OOP multiple method call architecture issue

I have a class: DatabaseService.as. This class creates a local SQLite connection and creates tables if they do not exist. The connection link will be used by several other classes. Some classes will be called on startup, others on user interaction.
The "DatabaseService" class dispatches an event when the database connection is opened. Other classes initialise the "DatabaseService" class and wait for the "DatabaseReadyEvent".
This works great but what can I do when I need to call a function/method from the same class several times?
Example:
I create an instance of the "PrefService" class in an mxml component. "PrefService" creates a new "DatabaseService" instance in its constructor. It then waits for "DatabaseReadyEvent" and executes an SQL query (this works fine). But then I also need to call the "addDir" method (and a few others) in the "PrefService" class, and the sqlConnection property is not set yet, causing an error. How can I deal with this? I am new to OOP so I am probably missing something quite simple...
What I've tried / My ideas:
I could check if sqlConnection exists in the "PrefService" class, but I think this would be poor practice and would still require a delay mechanism of some sort.
I could also create a new instance of "DatabaseService" class for each method and add a new event listener but this would be very cumbersome with 2 functions for each method call not to mention events.
What is the best option in this scenario?
The hate for Singleton is well-deserved. I'd suggest not ever getting in the habit of using it, so you don't have to break that habit when you find how horrible it is to maintain.
Your biggest mistake is having your View creating and executing your service. Unfortunately, this is encouraged by how the FB service generation code works. What you want, instead, is something more like MVCS (Model-View-Control-Service) of the type encouraged by Frameworks like Robotlegs.
To walk through how to go from a tightly-coupled architecture to a loosely-coupled one, start with this example. Note that the Service is a static Class, which pretty much has all the same issues as a Singleton as far as encouraging tight coupling. Even though there is only one Class using the Service, imagine what would happen if you have a large project where tens or hundreds of Classes are referencing it. Now imagine something needs to change. Ick.
Now look at the project, refactored so that the View is simply generating an Event that results in calling the service. The service is still static, but in this case there is exactly one thing that knows about it (Mate), so if you want to make that not static or sometimes use a different service, you easily can, now.
In fact, you can change things around so easily that this is the project, refactored to use Robotlegs. You don't necessarily have to use a Framework, as I did--you can see that the basic structure involved in the core Classes doesn't care a bit about how the Event is being handled or how the data gets into the Views. If you're not comfortable using a Framework, handle it your own way. But Frameworks have been around a while, and they've worked out a lot of issues you haven't thought of yet.
It's tricky to advise without seeing any code, but it might be worth considering making the DatabaseService class a Singleton and initialising it (and the database connection) once as part of your start-up routine (ie. before the classes which use it are instantiated). This would ensure that the classes which use the DatabaseService all share a single connection link to the database, and that the link is available when they come to use it.
Singletons in ActionScript generate a fair bit of debate because in other languages the pattern relies on the ability to set the access modifier of the class constructor as private (which you cannot do in ActionScript 3.0). However, you could choose from a couple of approaches detailed here.
Also, Singletons in general generate a fair bit of debate which might be worth understanding before you use one in anger (since you state you are new to OOP I am assuming you have not done so before).
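The "initialise once at start-up, before anything that needs the connection is created" idea can also be had without a Singleton, by handing the already-opened service to each class that needs it. Below is a minimal sketch of that variation, written in Java rather than ActionScript. DatabaseService, PrefService and addDir come from the question; everything else is assumed, including the synchronous open() (in the real application the start-up code would wait for the DatabaseReadyEvent first) and the list that stands in for a real SQLite connection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the real service: it records statements in a list
// instead of talking to SQLite.
class DatabaseService {
    private boolean connected = false;
    private final List<String> log = new ArrayList<>();

    // Assumed to be synchronous for this sketch.
    void open() {
        connected = true;
        log.add("CREATE TABLE IF NOT EXISTS dirs (path TEXT)");
    }

    void execute(String sql, String... args) {
        if (!connected) {
            throw new IllegalStateException("connection is not open yet");
        }
        log.add(sql + " " + Arrays.toString(args));
    }

    List<String> executedStatements() {
        return new ArrayList<>(log);
    }
}

// PrefService no longer creates its own DatabaseService; it receives one
// that the start-up code has already opened.
class PrefService {
    private final DatabaseService db;

    PrefService(DatabaseService db) {
        this.db = db;
    }

    void addDir(String path) {
        db.execute("INSERT INTO dirs (path) VALUES (?)", path);
    }
}
```

The start-up routine opens one DatabaseService and only then constructs PrefService and the other classes with it, so every method can rely on the connection being there.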

Interfaces vs Public Class Members

I've noticed that some programmers like to make interfaces for just about all their classes. I like interfaces for certain things (such as checking if an object supports a certain behavior and then having an interface for that behavior) but overuse of interfaces can sometimes bloat the code. When I declare methods or properties as public I'd expect people to just use my concrete classes and I don't really understand the need to create interfaces on top of that.
I'd like to hear your take on interfaces. When do you use them and for what purposes?
Thank you.
Applying any kind of design pattern or idea without thinking, just because somebody told you it's good practice, is a bad idea.
That of course includes creating a separate interface for each and every class you create. You should at least be able to give a good reason for every design decision, and "because Joe says it's good practice" is not a good enough reason.
Interfaces are good for decoupling the interface of some unit of code from its implementation. A reason to create an interface is because you foresee that there might be multiple implementations of it in the future. It can also help with unit testing; you can make a mock implementation of the services that the unit you want to test depends on, and plug the mock implementations in instead of "the real thing" for testing.
Interfaces are a powerful tool for abstraction. With them, you can more freely substitute (for example) test classes and thereby decouple your code. They are also a way to narrow the scope of your code; you probably don't need the full feature set of a given class in a particular place - exactly what features do you need? That's a client-focused way of thinking about interfaces.
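To make the "exactly what features do you need?" point concrete, here is a small Java sketch. All of the names (PriceLookup, Catalog, InvoiceCalculator) are hypothetical. The client depends on a one-method interface rather than on the full concrete class.

```java
import java.util.HashMap;
import java.util.Map;

// The single feature the client actually needs.
interface PriceLookup {
    int priceInCents(String sku);
}

// Concrete class with a wider API than any one client uses.
class Catalog implements PriceLookup {
    private final Map<String, Integer> prices = new HashMap<>();

    void addItem(String sku, int cents) {
        prices.put(sku, cents);
    }

    void removeItem(String sku) {
        prices.remove(sku);
    }

    @Override
    public int priceInCents(String sku) {
        Integer price = prices.get(sku);
        if (price == null) {
            throw new IllegalArgumentException("unknown SKU: " + sku);
        }
        return price;
    }
}

// The client sees only PriceLookup, so it cannot add or remove items,
// and a test can hand it a one-line stub instead of a full Catalog.
class InvoiceCalculator {
    private final PriceLookup prices;

    InvoiceCalculator(PriceLookup prices) {
        this.prices = prices;
    }

    int total(String... skus) {
        int sum = 0;
        for (String sku : skus) {
            sum += prices.priceInCents(sku);
        }
        return sum;
    }
}
```

Because PriceLookup has a single method, a test can substitute a lambda such as sku -> 250 for the real Catalog.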
Unit tests.
With an interface describing all class methods and properties it is within the reach of a click to create a mock-up class to simulate behavior that is not within the scope of said test.
It's all about expecting and preparing for change.
One approach that some use (and I'm not necessarily advocating it) is to create an IThing and a ThingFactory.
All code will reference IThing (instead of ConcreteThing).
All object creation can be done via the Factory Method:
ThingFactory.CreateThing(some params)
So, today we only have AmericanConcreteThing. And the possibility is that we may never need another. However, if experience has taught me anything, it is that we will ALWAYS need another.
You may not need EuropeanThing, but TexasAmericanThing is a distinct possibility.
So, in order to minimize the impact on my code, I can change the creational line to:
ThingFactory.CreateThing( Account )
and create my class TexasAmericanThing : IThing.
Other than building the class, the only change is to the ThingFactory, which will require a change from
public static IThing CreateThing(Account a)
{
    return new AmericanThing();
}
to
public static IThing CreateThing(Account a)
{
    if (a.State == State.TEXAS) return new TexasAmericanThing();
    return new AmericanThing();
}
I've seen plenty of mindless Interfaces myself. However, when used intelligently, they can save the day. You should use Interfaces for decoupling two components or two layers of an application. This can enable you to easily plug-in varying implementations of the interface without affecting the client, or simply insulate the client from constant changes to the implementation, as long as you stay true to the contract of the interface. This can make the code more maintainable in the long term and can save the effort of refactoring later.
However, overly aggressive decoupling can make for non-intuitive code. Its overuse can become a nuisance. You should carefully identify the cohesive parts of your application and the boundaries between them and use interfaces there. Another benefit of using Interfaces between such parts is that they can be developed in parallel and tested independently using mock implementations of the interfaces they use.
OTOH, having client code access public member methods directly is perfectly okay if you really don't foresee any changes to the class that might also necessitate changes in the client. In any case, however, having public member fields I think is not good. This is extremely tight coupling! You are basically exposing the architecture of your class and making the client code dependent on it. Tomorrow if you realize that another data structure for a particular field will perform better, you can't change it without also changing the client code.
I primarily use interfaces for IoC to enable unit testing.
On the one hand, this could be interpreted as premature generalization. On the other hand, using interfaces as a rule helps you write code that is more easily composable and hence testable. I think the latter wins out in many cases.
I like interfaces:
* to define a contract between parts/modules/subsystems or 3rd party systems
* when there are exchangeable states or algorithms (state/strategy)

Subclassing a test subject for Junit testing

I want to test validation logic in a legacy class. The class uses a method to load effective dates from a config file.
I have written a subclass of the class in question and overridden the config method so I can run my unit test against the subclass with any combination of effective dates.
Is this an appropriate strategy? It strikes me as a clean technique for testing code that you don't want to mess with.
I like it; it's the simplest and most straightforward way to get this done. And since it is a legacy class, it will not change anymore, so you don't run the risk of bumping into the fragile base class problem either.
It seems to be an appropriate strategy to me. Of course with this override you won't be able to test the code (in the original class) that loads the config data, but if you have other tests to cover this scenario then I think the approach you outlined is fine.
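For reference, the subclass-and-override technique from the question might look roughly like this in Java. The class and method names are invented for the sketch (the question does not show the legacy class), and the "config loading" is reduced to a stub that throws.

```java
import java.time.LocalDate;

// Stand-in for the legacy class. In the real class, loadEffectiveDates()
// would read the config file; here it just fails, to show that the test
// never reaches it.
class LegacyValidator {
    protected LocalDate[] loadEffectiveDates() {
        throw new IllegalStateException("config file not available in this sketch");
    }

    // The validation logic we actually want to test (bounds are inclusive).
    public boolean isValidOn(LocalDate date) {
        LocalDate[] range = loadEffectiveDates();
        return !date.isBefore(range[0]) && !date.isAfter(range[1]);
    }
}

// Test-only subclass: overrides the config method so the test can supply
// any combination of effective dates.
class TestableValidator extends LegacyValidator {
    private final LocalDate from;
    private final LocalDate to;

    TestableValidator(LocalDate from, LocalDate to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected LocalDate[] loadEffectiveDates() {
        return new LocalDate[] { from, to };
    }
}
```

The override only works if the config method is visible to and overridable by the subclass (not private, static or final). As noted above, the real config-loading code stays untested by this approach.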

Developing to an interface with TDD

I'm a big fan of TDD and use it for the vast majority of my development these days. One situation I run into somewhat frequently, though, and have never found what I thought was a "good" answer for, is something like the following (contrived) example.
Suppose I have an interface, like this (writing in Java, but really, this applies to any OO language):
public interface PathFinder {
    GraphNode[] getShortestPath(GraphNode start, GraphNode goal);
    int getShortestPathLength(GraphNode start, GraphNode goal);
}
Now, suppose I want to create three implementations of this interface. Let's call them DijkstraPathFinder, DepthFirstPathFinder, and AStarPathFinder.
The question is, how do I develop these three implementations using TDD? Their public interface is going to be the same, and, presumably, I would write the same tests for each, since the results of getShortestPath() and getShortestPathLength() should be consistent among all three implementations.
My choices seem to be:
Write one set of tests against PathFinder as I code the first implementation. Then write the other two implementations "blind" and make sure they pass the PathFinder tests. This doesn't seem right because I'm not using TDD to develop the second two implementation classes.
Develop each implementation class in a test-first manner. This doesn't seem right because I would be writing the same tests for each class.
Combine the two techniques above; now I have a set of tests against the interface and a set of tests against each implementation class, which is nice, but the tests are all the same, which isn't nice.
This seems like a fairly common situation, especially when implementing a Strategy pattern, and of course the differences between implementations might be more than just time complexity. How do others handle this situation? Is there a pattern for test-first development against an interface that I'm not aware of?
You write interface tests to exercise the interface, and you write more detailed tests for the actual implementations. Interface-based design talks a bit about the fact that your unit tests should form a kind of "contract" specification for that interface. Maybe when Spec# comes out, there'll be a language-supported way to do this.
In this particular case, which is a strict strategy implementation, the interface tests are enough. In other cases, where an interface is a subset of the implementation's functionality, you would have tests for both the interface and the implementation. Think of a class which implements 3 interfaces, for example.
EDIT: This is useful so that when you add another implementation of the interface down the road, you already have tests for verifying that the class implements the contract of the interface correctly. This can work for something as specific as ISortingStrategy to something as wide-ranging as IDisposable.
There is nothing wrong with writing tests against the interface and reusing them for each implementation, for example:
public class TestPathFinder : TestClass
{
    public IPathFinder _pathFinder;
    public IGraphNode _startNode;
    public IGraphNode _goalNode;
    public TestPathFinder() : this(null, null, null) { }
    public TestPathFinder(IPathFinder ipf,
        IGraphNode start, IGraphNode goal) : base()
    {
        _pathFinder = ipf;
        _startNode = start;
        _goalNode = goal;
    }
}
TestPathFinder tpfDijkstra = new TestPathFinder(
    new DijkstraPathFinder(), n1, nN);
tpfDijkstra.RunTests();
//etc. - factory optional
I would argue that this is the least effort solution, which is very much in line with Agile/TDD principles.
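The same idea in Java, as a self-contained sketch without a test framework: one set of contract checks that takes a factory and is run once per implementation. Several things are assumptions made to keep it short. GraphNode is a minimal stand-in, the interface is cut down to getShortestPathLength, -1 is used to mean "unreachable", edges are unweighted, and a breadth-first finder takes the place of the Dijkstra and A* implementations from the question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Minimal stand-in for the question's GraphNode (undirected, unweighted).
class GraphNode {
    final String name;
    final List<GraphNode> neighbours = new ArrayList<>();

    GraphNode(String name) { this.name = name; }

    void connect(GraphNode other) {
        neighbours.add(other);
        other.neighbours.add(this);
    }
}

// Reduced version of the interface; -1 means "no path" (an assumption).
interface PathFinder {
    int getShortestPathLength(GraphNode start, GraphNode goal);
}

class BreadthFirstPathFinder implements PathFinder {
    public int getShortestPathLength(GraphNode start, GraphNode goal) {
        Map<GraphNode, Integer> distance = new HashMap<>();
        ArrayDeque<GraphNode> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            GraphNode node = queue.remove();
            if (node == goal) return distance.get(node);
            for (GraphNode next : node.neighbours) {
                if (!distance.containsKey(next)) {
                    distance.put(next, distance.get(node) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }
}

// Exhaustive depth-first search over simple paths; slow but correct.
class DepthFirstPathFinder implements PathFinder {
    public int getShortestPathLength(GraphNode start, GraphNode goal) {
        return search(start, goal, new HashSet<>());
    }

    private int search(GraphNode node, GraphNode goal, Set<GraphNode> onPath) {
        if (node == goal) return 0;
        onPath.add(node);
        int best = -1;
        for (GraphNode next : node.neighbours) {
            if (onPath.contains(next)) continue;
            int rest = search(next, goal, onPath);
            if (rest >= 0 && (best < 0 || rest + 1 < best)) best = rest + 1;
        }
        onPath.remove(node);
        return best;
    }
}

// The shared contract: written once, run against every implementation.
class PathFinderContract {
    // Returns the names of the failed checks; an empty list means "passed".
    static List<String> check(Supplier<PathFinder> factory) {
        GraphNode a = new GraphNode("a"), b = new GraphNode("b"),
                  c = new GraphNode("c"), d = new GraphNode("d"),
                  lonely = new GraphNode("lonely");
        a.connect(b); b.connect(c); c.connect(d); a.connect(d);

        PathFinder finder = factory.get();
        List<String> failures = new ArrayList<>();
        if (finder.getShortestPathLength(a, a) != 0) failures.add("same node");
        if (finder.getShortestPathLength(a, c) != 2) failures.add("two hops");
        if (finder.getShortestPathLength(a, d) != 1) failures.add("prefers direct edge");
        if (finder.getShortestPathLength(a, lonely) != -1) failures.add("unreachable");
        return failures;
    }
}
```

Each new implementation is then covered by one more call, such as PathFinderContract.check(DepthFirstPathFinder::new), and implementation-specific tests can sit next to it if needed.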
I would have no problem going with option 1, and keep in mind that refactoring is part of TDD and it's usually during a refactoring phase that you move to a design pattern such as strategy, so I wouldn't feel bad about doing that w/o writing new tests.
If you wanted to test the implementation-specific details of each PathFinder impl, you might consider passing mock GraphNodes which are somehow capable of helping to assert the Dijkstra-ness or DepthFirst-ness, etc, of the implementation. (Perhaps these mock GraphNodes could record how they are traversed, or somehow measure performance.) Maybe this is testing overkill, but then again if you know your system needs these three distinct strategies for some reason, it'd probably be good to have tests to demonstrate why - otherwise why not just pick one implementation and throw the others away?
I don't mind reusing test code as a template for new tests that have similar functionality. Depending on the particular class under test, you may have to rework them with different mock objects and expectations. At the least you'll have to refactor them to use the new implementation. I would follow the TDD method, though, of taking one test, reworking it for the new class, then writing just the code to pass that test. This may take even more discipline, though, since you already have one implementation under your belt and will undoubtedly be influenced by code you have already written.
"This doesn't seem right because I'm not using TDD to develop the second two implementation classes."
Sure you are.
Start by commenting out all the tests but one. As you make each test pass, either refactor or uncomment another test.
Jtf

Does generated code need to be human readable?

I'm working on a tool that will generate the source code for an interface and a couple classes implementing that interface. My output isn't particularly complicated, so it's not going to be hard to make the output conform to our normal code formatting standards.
But this got me thinking: how human-readable does auto-generated code need to be? When should extra effort be expended to make sure the generated code is easily read and understood by a human?
In my case, the classes I'm generating are essentially just containers for some data related to another part of the build with methods to get the data. No one should ever need to look at the code for the classes themselves, they just need to call the various getters the classes provide. So, it's probably not too important if the code is "clean", well formatted and easily read by a human.
However, what happens if you're generating code that has more than a small amount of simple logic in it?
I think it's just as important for generated code to be readable and follow normal coding styles. At some point, someone is either going to need to debug the code or otherwise see what is happening "behind the scenes".
Yes, absolutely! I can even throw in a story for you to explain why it is important that a human can easily read the auto generated code...
I once got the opportunity to work on a new project. Now, one of the first things you need to do when you start writing code is to create some sort of connection and data representation to and from the database. But instead of just writing this code by hand, we had someone who had developed his own code generator to automatically build base classes from a database schema. It was really neat, the tedious job of writing all this code was now out of our hands... The only problem was, the generated code was far from readable for a normal human.
Of course we didn't care about that, because hey, it just saved us a lot of work.
But after a while things started to go wrong: data was incorrectly read from the user input (or so we thought), and corruptions occurred inside the database while we were only reading. Strange.. because reading doesn't change any data (again, so we thought)...
Like any good developer we started to question our own code, but after days of searching.. even rewriting code, we could not find anything... and then it dawned on us, the auto generated code was broken!
So now an even bigger task awaited us, checking auto generated code that no sane person could understand in a reasonable amount of time... I'm talking about non indented, really bad style code with unpronounceable variable and function names... It turned out that it would even be faster to rewrite the code ourselves, instead of trying to figure out how the code actually worked.
Eventually the developer who wrote the code generator remade it later on, so it now produces readable code, in case something went wrong like before.
Here is a link I just found about the topic at hand; I was actually looking for a link to one of the chapters from the "pragmatic programmer" book to point out why we looked in our code first.
I think that depends on how the generated code will be used. If the code is not meant to be read by humans, i.e. it's regenerated whenever something changes, I don't think it has to be readable. However, if you are using code generation as an intermediate step in "normal" programming, the generated code should have the same readability as the rest of your source code.
In fact, making the generated code "unreadable" can be an advantage, because it will discourage people from "hacking" generated code, and rather implement their changes in the code-generator instead—which is very useful whenever you need to regenerate the code for whatever reason and not lose the changes your colleague did because he thought the generated code was "finished".
Yes it does.
Firstly, you might need to debug it -- you will be making it easy on yourself.
Secondly it should adhere to any coding conventions you use in your shop because someday the code might need to be changed by hand and thus become human code. This scenario typically ensues when your code generation tool does not cover one specific thing you need and it is not deemed worthwhile modifying the tool just for that purpose.
Look up active code generation vs. passive code generation. With respect to passive code generation, absolutely yes, always. With regards to active code generation, when the code achieves the goal of being transparent, which is acting exactly like a documented API, then no.
I would say that it is imperative that the code is human readable. Unless your code-gen tool has an excellent debugger, you (or an unfortunate co-worker) will probably be the one waist deep in the code trying to track down that oh-so-elusive bug in the system. My own excursion into 'code from UML' left a bitter taste in my mouth, as I could not get to grips with the supposedly 'fancy' debugging process.
The whole point of generated code is to do something "complex" that is easier defined in some higher level language. Due to it being generated, the actual maintenance of this generated code should be within the subroutine that generates the code, not the generated code.
Therefore, human readability should have a lower priority; things like runtime speed or functionality are far more important. This is particularly the case when you look at tools like bison and flex, which use the generated code to pre-generate speedy lookup tables to do pattern matching, which would simply be insane to manually maintain.
You will kill yourself if you have to debug your own generated code. Don't start thinking you won't. Keep in mind that when you trust your code to generate code then you've already introduced two errors into the system - You've inserted yourself twice.
There is absolutely NO reason NOT to make it human parseable, so why in the world would you want to do otherwise?
-Adam
One more aspect of the problem which was not mentioned is that the generated code should also be "version control-friendly" (as far as it is feasible).
I found it useful many times to double-check diffs in generated code vs the source code.
That way you could even occasionally find bugs in tools which generate code.
It's quite possible that somebody in the future will want to go through and see what your code does. So making it somewhat understandable is a good thing.
You also might want to include at the top of each generated file a comment saying how and why this file was generated and what its purpose is.
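As a small illustration of both points (readable output plus a header comment), here is a sketch of a generator in Java that emits a getter-only container class. It is only a toy: GetterClassGenerator and its parameters are made-up names, and it assumes the keys are valid Java identifiers and the values contain no quotes or backslashes.

```java
import java.util.Map;

// Toy generator for a read-only data container.
// Assumes keys are valid Java identifiers and values need no escaping.
class GetterClassGenerator {
    static String generate(String className, Map<String, String> values,
                           String sourceDescription) {
        StringBuilder out = new StringBuilder();

        // Header: says how the file came to be and how to change it.
        out.append("// GENERATED FILE - do not edit by hand.\n");
        out.append("// Source: ").append(sourceDescription).append("\n");
        out.append("// To change this file, change the source and re-run the generator.\n");

        out.append("public final class ").append(className).append(" {\n");
        for (Map.Entry<String, String> entry : values.entrySet()) {
            String name = entry.getKey();
            String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);

            // Consistent four-space indentation, one blank line between methods.
            out.append("\n");
            out.append("    public String ").append(getter).append("() {\n");
            out.append("        return \"").append(entry.getValue()).append("\";\n");
            out.append("    }\n");
        }
        out.append("}\n");
        return out.toString();
    }
}
```

Emitting the indentation and the header costs a few extra characters per append, which is the sense in which readable output is cheap for a generator this simple.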
Generally, if you're generating code that needs to be human-modified later, it needs to be as human-readable as possible. However, even if it's code that will be generated and never touched again, it still needs to be readable enough that you (as the developer writing the code generator) can debug the generator - if your generator spits out bad code, it may be hard to track down if it's difficult to understand.
I would think it's worth it to take the extra time to make it human readable just to make it easier to debug.
Generated code should be readable (formatting etc. can usually be handled by a half-decent IDE). At some stage in the code's lifetime it is going to be viewed by someone and they will want to make sense of it.
I think for data containers or objects with very straightforward workings, human readability is not very important.
However, as soon as a developer may have to read the code to understand how something happens, it needs to be readable. What if the logic has a bug? How will anybody ever discover it if no one is able to read and understand the code? I would go so far as generating comments for the more complicated logic sections, to express the intent, so it's easier to determine if there really is a bug.
Logic should always be readable. If someone else is going to read the code, try to put yourself in their place and see if you would fully understand the code in high (and low?) level without reading that particular piece of code.
I wouldn't spend too much time on code that would never be read, but if it doesn't take too much time I would go through the generated code. If not, at least add a comment to make up for the loss of readability.
If this code is likely to be debugged, then you should seriously consider generating it in a human readable format.
There are different types of generated code, but the most simple types would be:
Generated code that is not meant to be seen by the developer. e.g., xml-ish code that defines layouts (think .frm files, or the horrible files generated by SSIS)
Generated code that is meant to be a basis for a class that will be later customized by your developer, e.g., code is generated to reduce typing tedium
If you're making the latter, you definitely want your code to be human readable.
Classes and interfaces, no matter how "off limits" to developers you think they should be, would almost certainly fall under generated code type number 2. They will be hit by the debugger at one point or another, and applying code formatting is the least you can do to ease the debugging process when the debugger steps into those generated classes.
Like virtually everybody else here, I say make it readable. It costs nothing extra in your generation process and you (or your successor) will appreciate it when they go digging.
For a real world example - look at anything Visual Studio generates. Well formatted, with comments and everything.
Generated code is code, and there's no reason any code shouldn't be readable and nicely formatted. This is especially cheap in generated code: you don't need to apply formatting yourself, the generator does it for you every time! :)
As a secondary option in case you're really that lazy, how about piping the code through a beautifier utility of your choice before writing it to disk to ensure at least some level of consistency. Nevertheless, almost all good programmers I know format their code rather pedantically and there's a good reason for it: there's no write-only code.
Absolutely yes, for tons of good reasons already given above. One more is that if your code needs to be checked by an assessor (for safety and dependability issues), it is much better if the code is human readable. If not, the assessor will refuse to assess it and your project will be rejected by the authorities. The only solution is then to assess... the code generator (that's usually much more difficult ;))
It depends on whether the code will only be read by a compiler or also by a human. In addition, it matters whether the code is supposed to be super-fast or whether readability is important. When in doubt, put in the extra effort to generate readable code.
I think the answer is: it depends.
* It depends upon whether you need to configure and store the generated code as an artefact. For example, people very rarely keep or configure the object code output from a C compiler, because they know they can reproduce it from the source every time. I think there may be a similar analogy here.
* It depends upon whether you need to certify the code to some standard, e.g. Misra-C or DO178.
* It depends upon whether the source will be generated via your tool every time the code is compiled, or if it will be stored for inclusion in a build at a later time.
Personally, if all you want to do is build the code, compile it into an executable and then throw the intermediate code away, then I can't see any point in making it too pretty.