What does "to stub" mean in programming? - terminology

For example, what does it mean in this quote?
Integrating with an external API is almost a guarantee in any modern web app. To effectively test such integration, you need to stub it out. A good stub should be easy to create and consistently up-to-date with actual, current API responses. In this post, we’ll outline a testing strategy using stubs for an external API.

A stub is a controllable replacement for an Existing Dependency (or collaborator)
in the system. By using a stub, you can test your code without
dealing with the dependency directly.
External Dependency - Existing Dependency:
It is an object in your system that your code
under test interacts with and over which you have no control. (Common
examples are filesystems, threads, memory, time, and so on.)
For example, in the code below:
public void Analyze(string filename)
{
    if (filename.Length > 8)
    {
        try
        {
            errorService.LogError("long file entered named:" + filename);
        }
        catch (Exception e)
        {
            mailService.SendEMail("admin@hotmail.com", "ErrorOnWebService", "someerror");
        }
    }
}
You want to test the mailService.SendEMail() call, but it only runs when errorService.LogError() throws, so your test has to simulate an exception. To do that, you create a stub errorService object that produces the result you want (here, an exception), and your test can then exercise the mailService.SendEMail() path. In other words, you simulate a result that comes from another dependency, the ErrorService object (the existing dependency).
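The same idea as a minimal Python sketch. The class names here (FailingErrorServiceStub, RecordingMailService, Analyzer) are made up for illustration and are not part of the code above; the stub always raises, which forces the code under test down the mail-sending path.

```python
class FailingErrorServiceStub:
    """Stub: stands in for the real error service and always fails."""

    def log_error(self, message):
        raise RuntimeError("simulated logging failure")


class RecordingMailService:
    """Fake mail service that records calls instead of sending mail."""

    def __init__(self):
        self.sent = []

    def send_email(self, to, subject, body):
        self.sent.append((to, subject, body))


class Analyzer:
    """Code under test; both dependencies are passed in so they can be replaced."""

    def __init__(self, error_service, mail_service):
        self.error_service = error_service
        self.mail_service = mail_service

    def analyze(self, filename):
        if len(filename) > 8:
            try:
                self.error_service.log_error("long file entered named:" + filename)
            except Exception:
                self.mail_service.send_email(
                    "admin@hotmail.com", "ErrorOnWebService", "someerror"
                )


# The test: the stub makes logging fail, so the mail path must run.
mail = RecordingMailService()
Analyzer(FailingErrorServiceStub(), mail).analyze("averylongfilename.txt")
assert mail.sent == [("admin@hotmail.com", "ErrorOnWebService", "someerror")]
```

The stub only controls what the dependency does; the assertion is made against the recorded mail calls.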

A stub, in this context, means a mock implementation.
That is, a simple, fake implementation that conforms to the interface and is to be used for testing.

In layman's terms, it's dummy data (or fake data, test data, etc.) that you can use to test or develop your code against until you (or the other party) are ready to present or receive real data. It's a programmer's "Lorem Ipsum".
Employee database not ready? Make up a simple one with Jane Doe, John Doe...etc.
API not ready? Make up a fake one by creating a static .json file containing fake data.
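A small Python sketch of the "fake API" case, assuming a made-up employee payload and function names (fetch_employees, employee_names); in practice the canned JSON could equally live in a static .json file.

```python
import json

# Hypothetical canned payload standing in for the real API response.
FAKE_EMPLOYEES_JSON = '[{"id": 1, "name": "Jane Doe"}, {"id": 2, "name": "John Doe"}]'


def fetch_employees():
    """Stub for the not-yet-available employee API: returns canned data."""
    return json.loads(FAKE_EMPLOYEES_JSON)


def employee_names(fetch=fetch_employees):
    """Code under development; it only depends on the shape of the data."""
    return [employee["name"] for employee in fetch()]
```

When the real API is ready, a real fetch function can be passed in place of the stub without changing employee_names.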

In this context, the word "stub" is used in place of "mock", but for the sake of clarity and precision, the author should have used "mock", because "mock" is a sort of stub, but for testing. To avoid further confusion, we need to define what a stub is.
In the general context, a stub is a piece of program (typically a function or an object) that encapsulates the complexity of invoking another program (usually located on another machine, VM, or process - but not always, it can also be a local object). Because the actual program to invoke is usually not located on the same memory space, invoking it requires many operations such as addressing, performing the actual remote invocation, marshalling/serializing the data/arguments to be passed (and same with the potential result), maybe even dealing with authentication/security, and so on. Note that in some contexts, stubs are also called proxies (such as dynamic proxies in Java).
A mock is a very specific and restrictive kind of stub, because a mock is a replacement of another function or object for testing. In practice we often use mocks as local programs (functions or objects) to replace a remote program in the test environment. In any case, the mock may simulate the actual behaviour of the replaced program in a restricted context.
Most famous kinds of stubs are obviously for distributed programming, when needing to invoke remote procedures (RPC) or remote objects (RMI, CORBA). Most distributed programming frameworks/libraries automate the generation of stubs so that you don't have to write them manually. Stubs can be generated from an interface definition, written with IDL for instance (but you can also use any language to define interfaces).
Typically, in RPC, RMI, CORBA, and so on, one distinguishes client-side stubs, which mostly take care of marshaling/serializing the arguments and performing the remote invocation, and server-side stubs, which mostly take care of unmarshaling/deserializing the arguments and actually executing the remote function/method. Obviously, client stubs are located on the client side, while server stubs (often called skeletons) are located on the server side.
Writing good, efficient, and generic stubs becomes quite challenging when dealing with object references. Most distributed object frameworks such as RMI and CORBA deal with distributed object references, but that's something most programmers avoid in REST environments, for instance. Typically, in REST environments, JavaScript programmers write simple stub functions to encapsulate the AJAX invocations (object serialization being supported by JSON.parse and JSON.stringify). The Swagger Codegen project provides extensive support for automatically generating REST stubs in various languages.

A stub is a function definition that has the correct function name and the correct number of parameters, and that produces a dummy result of the correct type.
It helps you write the tests and serves as a kind of scaffolding that makes it possible to run the examples even before the function design is complete.
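As a rough Python illustration (average_rating and describe are invented names), a stubbed-out function has the final signature and returns a dummy value of the right type, so callers can already be written and run:

```python
def average_rating(ratings):
    """Stub: right name, right parameter, dummy result of the right type."""
    return 0.0  # placeholder until the real computation is written


def describe(ratings):
    """Caller that can already be written and run against the stub."""
    return f"average: {average_rating(ratings):.1f}"
```

Tests written against describe will run now and start checking real values once the stub body is replaced.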

This phrase is almost certainly an analogy with a phase in house construction — "stubbing out" plumbing. During construction, while the walls are still open, the rough plumbing is put in. This is necessary for the construction to continue. Then, when everything around it is ready enough, one comes back and adds faucets and toilets and the actual end-product stuff. (See for example How to Install a Plumbing Stub-Out.)
When you "stub out" a function in programming, you build enough of it to work around (for testing or for writing other code). Then, you come back later and replace it with the full implementation.

There are also some very good testing frameworks for creating such stubs.
One of my favourites is Mockito. There are also EasyMock and others, but Mockito is great and worth reading about: it is a very elegant and powerful package.

RPC Stubs
Basically, a client-side stub is a procedure that looks to the client as if it were a callable server procedure.
A server-side stub looks to the server as if it's a calling client.
The client program thinks it is calling the server; in fact, it's calling the client stub.
The server program thinks it's called by the client; in fact, it's called by the server stub.
The stubs send messages to each other to make the RPC happen.
Source
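The stub pair described above can be sketched in a few lines of Python. This is only an in-process illustration under stated assumptions: the "network" is a plain function call (transport), JSON is used for marshalling, and all names are hypothetical rather than taken from any RPC framework.

```python
import json


def add(a, b):
    """The real procedure, living on the 'server'."""
    return a + b


PROCEDURES = {"add": add}


def server_stub(request_bytes):
    """Server-side stub (skeleton): unmarshal, call the procedure, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transport(request_bytes):
    """Stand-in for the network: hands the bytes straight to the server stub."""
    return server_stub(request_bytes)


def add_client_stub(a, b):
    """Client-side stub: looks like a local add(), but marshals and 'sends' the call."""
    request = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = transport(request)
    return json.loads(reply.decode("utf-8"))["result"]
```

The caller writes add_client_stub(2, 3) as if it were a local call; only the two stubs know that bytes are exchanged in between.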

"Stubbing-out a function means you'll write only enough to show that the function was called, leaving the details for later when you have more time."
From: SAMS Teach yourself C++, Jesse Liberty and Bradley Jones

A stub is a piece of code that converts the parameters during an RPC (remote procedure call). The parameters of an RPC have to be converted because the client and the server use different address spaces. The stub performs this conversion so that the server perceives the RPC as a local function call.

A stub can be described as a fake substitute for the original function. It supplies output that is not available right now because the original is either:
not developed yet, or
not callable from the current environment (for example, a test environment).
A stub has:
the exact number of parameters
the exact output format (not necessarily the correct output)
Why is a stub used?
When a function is not accessible in an environment such as testing, or when its implementation is not available.
Example:
Let's say you want to test a function that makes a network call. While testing, you cannot wait for the result of a network call, so you write a mock output of the network call and proceed with your test.
TestFunction() {
    // Some things here
    // Some things here
    var result = networkCall(param)
    // something depending on the result
}
Let's say this networkCall returns a string, so you have to create a function with exactly the same parameters that also returns a string.
String fakeNetworkCall(int param) {
    if (param == 1) return "OK";
    else return "NOT OK";
}
Now that you have written a fake function, use it as a replacement in your code:
TestFunction() {
    // Some things here
    // Some things here
    var result = fakeNetworkCall(param)
    // something depending on the result
}
This fakeNetworkCall is a stub.

Related

NUnit equivalent for JUnit test state management with @Before/@After

I come from the Java world, where I mostly used JUnit, and now I have some problems expressing some aspects of tests with NUnit 3. In JUnit, each test creates its own instance of a test class, so it's perfectly valid to create some instance variables in a test class and set them up in a @Before method; the test method and helpers can access these variables freely without worrying that they would be overwritten by other tests run in parallel, and @After tears down the test data nicely. With NUnit this does not work, and the SetUp and TearDown methods seem to be useless in this case, because the test fixture instance is reused between invocations of test methods, so fields of the test fixture class can be (and are) overwritten by every invocation of a test method (my class has a few test methods, and each of them generates several test cases, so there are some tens of invocations in one test run).
I do not know how to work around this problem. In my scenario, set up would create a temporary folder, which would be used as a work folder for the following test case. Tear down would delete the temporary folder afterwards, cleaning up all intermediate files created by the tested method. But now, when SetUp creates a temporary folder and stores its path in an instance field (so it can be read by test logic and somewhat complicated asserts and verifiers), the value of that field is overwritten by test cases run in parallel. I considered several approaches:
implement an IDisposable which would represent the context of each test, and enclose it with using in each test method - I do not like this idea, because I do not like IDisposable being used as anything other than a resource management tool, and combining IDisposable with using to simulate set up/tear down smells to me like an abuse of this particular language feature,
create a method which accepts a delegate for the actual test logic, and which invokes custom SetUpTestCase/TearDownTestCase methods. The method would invoke set up, then the test delegate, and tear down afterwards. What I do not like about this approach is that it does not play well with test methods which accept parameters - each set of test methods parametrized in a particular way would need a corresponding delegate type. It also seems to go against the spirit of NUnit and its way of describing test methods with attributes - after all, why should the main logic of my test be delegated to anything? Shouldn't the [Test] or [TestCase] method be the actual test?
maybe there's some way to use more advanced aspects of NUnit, like actions or some callbacks/triggers/whatever, and I am just too inexperienced to see them. What I particularly miss is a way to transfer data from the set up method (for example, a path to a temporary folder created by it) to the test method that follows. I cannot use instance fields for this, and I do not know whether there exists any "tag" structure which would pass test-specific data between methods invoked at different stages of a test lifecycle.
Generally, the SetUp and TearDown attributes seem pretty useless to me if they cannot set up the test case without their result being overwritten immediately by another test case run in parallel. What am I missing here?
How can I implement such per-test-case, scoped set up/tear down behavior with NUnit? What am I doing wrong, or what am I missing?
As you have established, the TestFixture class is instantiated once before the OneTimeSetUp is called; then for each test it runs a set of SetUp, Test and TearDown; and finally, the OneTimeTearDown.
If you want the tests to be run in parallel (which is not the default) then you must specify The Parallelizable Attribute. Whether you do that or not, it is a good idea for your tests to be written independently, so they do not conflict with each other - they need to be structured.
The AAA (Arrange, Act, Assert) pattern is a common way of structuring unit tests for a method under test. If your tests are to be run in parallel, then TestFixture fields are not suitable for holding information which may conflict across parallel tests, in the same way that it wouldn't be suitable in a multithreaded class.
I'd suggest using a private method in the TestFixture to set up the temporary folder - it will need to have some way of providing a unique folder name, so that the parallel tests do not interact - perhaps use a Guid or CallerMemberName as part of the folder name, and return the folder name.
This method should be called from the Arrange part of the test. And you'll need a try...finally wrapping the rest of the Test to ensure the folder gets torn down. Or you could go with your IDisposable idea - I don't think there's anything wrong with that: the whole point of that is to guarantee tidying up resources (both managed and unmanaged) when something goes out of scope.
Your second suggestion of a delegate would also be fine if you used lambda expressions rather than strictly-defined delegates - the lambda expression can capture variables from the containing scope.
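The "unique folder per test, torn down in a finally" idea is not specific to NUnit. As a language-neutral illustration, here is a Python sketch (not NUnit code; temp_work_folder and run_example_test are invented names) where the folder path is a local variable of each test rather than a shared fixture field:

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_work_folder():
    """Create a unique folder for one test and always remove it afterwards."""
    path = tempfile.mkdtemp(prefix="test-work-")  # unique name per call
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


def run_example_test():
    # Arrange: the folder is local to this test, so parallel tests cannot clash.
    with temp_work_folder() as folder:
        target = os.path.join(folder, "out.txt")
        # Act
        with open(target, "w") as handle:
            handle.write("done")
        # Assert
        assert os.path.exists(target)
    return folder  # returned only so the cleanup can be checked afterwards
```

This corresponds to the IDisposable-with-using approach from the question: set up and tear down are scoped to the test body instead of to the fixture instance.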

Is assert in private function redundant if check has already been made by the calling public function?

Effective Java recommends assertions in private methods as good practice.
"For an unexported method, you as the package author control the circumstances under which the method is called, so you can and should ensure that only valid parameter values are ever passed in. Therefore, nonpublic methods should generally check their parameters using assertions, as shown below:"
For example:
// Private helper function for a recursive sort
private static void sort(long a[]) {
    assert a != null;
    // Do the computation;
}
My question is: would the asserts be required even if the public function calling sort has a null check?
Example:
public void computeTwoNumbersThatSumToInputValue(int a[], int x) {
    if (a == null) {
        throw new NullPointerException();
    }
    sort(a);
    // code to do the required.
}
In other words, will asserts in the private function be redundant or mandatory in this case?
Thanks,
It's redundant if you're sure that you've got the assertion in all the calling code. In some cases, that's very obvious - in other cases it can be less so. If you're calling sort from 20 places in the class, are you sure you've checked it in every case?
It's a matter of taste and balance, with no "one size fits all" answer. The balance is in terms of code clarity (both ways!), performance (in extreme cases) and of course safety. It depends on the exact context, and I wouldn't personally like to even guarantee that I'm entirely consistent. (In other words, "level of caffeine at the time of coding" may turn out to be an influence too.)
Note that your assert is only going to execute when assertions are turned on anyway - I personally prefer to validate parameters consistently however you're running the code. I generally use the Preconditions class from Guava to make preconditions unobtrusive.
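Python has the same split between assertions and unconditional checks: assert statements are skipped when the interpreter runs with -O, while an explicit check always runs. A minimal sketch of that trade-off, with invented function names mirroring the question:

```python
def _sort_in_place(values):
    # Internal helper: the assert documents the assumption; skipped under -O.
    assert values is not None, "callers must pass a list"
    values.sort()


def two_numbers_that_sum_to(values, target):
    # Public entry point: always validates, however the program is run.
    if values is None:
        raise ValueError("values must not be None")
    ordered = list(values)
    _sort_in_place(ordered)
    # Two-pointer scan over the sorted copy.
    low, high = 0, len(ordered) - 1
    while low < high:
        total = ordered[low] + ordered[high]
        if total == target:
            return ordered[low], ordered[high]
        if total < target:
            low += 1
        else:
            high -= 1
    return None
```

Here the assert in the helper is redundant for this one caller, but it still protects against any future caller that forgets the check.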
Assertions will make the helper function sort more robust to use.
Checking parameters before passing them to any method is good practice, because it gives you more control over exceptions that would otherwise occur unintentionally at runtime.
My suggestion is to use both approaches in your code, as there is no guarantee that all callers of sort will perform such checks. If the assertions in helper methods are computationally expensive or seem redundant, they can be disabled (especially for production use) with -disableassertions or -da on the command line.
You could do that. I will quote from the Oracle docs.
An assertion is a statement in the Java™ programming language that
enables you to test your assumptions about your program. For example,
if you write a method that calculates the speed of a particle, you
might assert that the calculated speed is less than the speed of
light.
I do not personally use assertions, but from what I gathered reading the Oracle docs, they enable you to test your assumptions about what you expect something to do. Try/catch blocks are more for failing gracefully when failures are bound to happen (such as networking or computer problems). Basically, in a perfect world your code would always run successfully because there's nothing wrong with it code-wise. But this isn't a perfect world. Also note:
Experience has shown that writing assertions while programming is one
of the quickest and most effective ways to detect and correct bugs. As
an added benefit, assertions serve to document the inner workings of
your program, enhancing maintainability.
I would say use them as a matter of preference. To answer your question, I would mainly use them to test code, as the docs say, by checking the assumptions you have about your code. As the second quote mentions, assertions have the added benefit of telling other developers (or future you) what you expect to receive as parameters. As a personal preference, I leave control flow to try/catch blocks, as that is what they were designed for.
*But keep in mind that assertions could be turned off.

Is it safe to call jcuda.driver.JCudaDriver/cuInit multiple times in a program?

I'm using a dynamic language (Clojure) to create CUDA contexts in an interactive development style using JCuda. Often I will call an initializer that includes the call to jcuda.driver.JCudaDriver/cuInit. Is it safe to call cuInit multiple times? In addition, is there something like a destroy method for cuInit? I ask since it's possible for the error code CUDA_ERROR_DEINITIALIZED to be returned.
To answer the question, yes it is probably safe to call cuInit multiple times. I haven't noticed any side effects from doing so.
Note, however, that cuInit only triggers one-time initialisation processes inside the API. It doesn't do anything with devices or contexts, and it definitely can't return CUDA_ERROR_DEINITIALIZED. Repeating the steps you would perform after calling cuInit in an application (i.e. creating a context) would have real implications: each call creates a new context, and resource exhaustion will occur if contexts are not actively destroyed. There is no equivalent deinitialisation call for the API. I guess the intention is that, once initialised, the runtime API is expected to stay in that state until the application terminates.

How should I refactor my code to remove unnecessary singletons?

I was confused when I first started to see anti-singleton commentary. I have used the singleton pattern in some recent projects, and it was working out beautifully. So much so, in fact, that I have used it many, many times.
Now, after running into some problems, reading this SO question, and especially this blog post, I understand the evil that I have brought into the world.
So: How do I go about removing singletons from existing code?
For example:
In a retail store management program, I used the MVC pattern. My Model objects describe the store, the user interface is the View, and I have a set of Controllers that act as liaison between the two. Great. Except that I made the Store into a singleton (since the application only ever manages one store at a time), and I also made most of my Controller classes into singletons (one mainWindow, one menuBar, one productEditor...). Now, most of my Controller classes get access to the other singletons like this:
Store managedStore = Store::getInstance();
managedStore.doSomething();
managedStore.doSomethingElse();
//etc.
Should I instead:
1. Create one instance of each object and pass references to every object that needs access to them?
2. Use globals?
3. Something else?
Globals would still be bad, but at least they wouldn't be pretending.
I see #1 quickly leading to horribly inflated constructor calls:
someVar = SomeControllerClass(managedStore, menuBar, editor, sasquatch, ...)
Has anyone else been through this yet? What is the OO way to give many individual classes access to a common variable without it being a global or a singleton?
Dependency Injection is your friend.
Take a look at these posts on the excellent Google Testing Blog:
Singletons are Pathological Liars (but you probably already understand this if you are asking this question)
A talk on Dependency Injection
Guide to Writing Testable Code
Hopefully someone has made a DI framework/container for the C++ world? Looks like Google has released a C++ Testing Framework and a C++ Mocking Framework, which might help you out.
It's not the Singleton-ness that is the problem. It's fine to have an object that there will only ever be one instance of. The problem is the global access. Your classes that use Store should receive a Store instance in the constructor (or have a Store property / data member that can be set) and they can all receive the same instance. Store can even keep logic within it to ensure that only one instance is ever created.
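A minimal Python sketch of that suggestion, with illustrative class names loosely based on the question (Store, ProductEditor, MenuBar) and a hypothetical build_application function: one Store is created in one place and handed to every object that needs it.

```python
class Store:
    """The single store the application manages; no global accessor."""

    def __init__(self, name):
        self.name = name
        self.products = []

    def add_product(self, product):
        self.products.append(product)


class ProductEditor:
    def __init__(self, store):
        self.store = store  # injected, not fetched via Store.getInstance()

    def add(self, product):
        self.store.add_product(product)


class MenuBar:
    def __init__(self, store):
        self.store = store

    def title(self):
        return f"{self.store.name} ({len(self.store.products)} products)"


def build_application():
    """Composition root: the one place that creates and wires the objects."""
    store = Store("Main Street")
    return store, ProductEditor(store), MenuBar(store)
```

There is still only one Store instance at runtime, but a test can construct a ProductEditor with a throwaway or fake Store instead.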
My way to avoid singletons derives from the idea that "application global" doesn't mean "VM global" (i.e. static). Therefore I introduce an ApplicationContext class which holds much of the formerly static singleton information that should be application global, like the configuration store. This context is passed into all structures. If you use any IOC container or service manager, you can use this to get access to the context.
There's nothing wrong with using a global or a singleton in your program. Don't let anyone get dogmatic on you about that kind of crap. Rules and patterns are nice rules of thumb. But in the end it's your project and you should make your own judgments about how to handle situations involving global data.
Unrestrained use of globals is bad news. But as long as you are diligent, they aren't going to kill your project. Some objects in a system deserve to be singleton. The standard input and outputs. Your log system. In a game, your graphics, sound, and input subsystems, as well as the database of game entities. In a GUI, your window and major panel components. Your configuration data, your plugin manager, your web server data. All these things are more or less inherently global to your application. I think your Store class would pass for it as well.
It's clear what the cost of using globals is. Any part of your application could be modifying it. Tracking down bugs is hard when every line of code is a suspect in the investigation.
But what about the cost of NOT using globals? Like everything else in programming, it's a trade off. If you avoid using globals, you end up having to pass those stateful objects as function parameters. Alternatively, you can pass them to a constructor and save them as a member variable. When you have multiple such objects, the situation worsens. You are now threading your state. In some cases, this isn't a problem. If you know only two or three functions need to handle that stateful Store object, it's the better solution.
But in practice, that's not always the case. If every part of your app touches your Store, you will be threading it to a dozen functions. On top of that, some of those functions may have complicated business logic. When you break that business logic up with helper functions, you have to -- thread your state some more! Say for instance you realize that a deeply nested function needs some configuration data from the Store object. Suddenly, you have to edit 3 or 4 function declarations to include that store parameter. Then you have to go back and add the store as an actual parameter to everywhere one of those functions is called. It may be that the only use a function has for a Store is to pass it to some subfunction that needs it.
Patterns are just rules of thumb. Do you always use your turn signals before making a lane change in your car? If you're the average person, you'll usually follow the rule, but if you are driving at 4am on an empty highway, who gives a crap, right? Sometimes it'll bite you in the butt, but that's a managed risk.
Regarding your inflated constructor call problem, you could introduce parameter classes or factory methods to alleviate this problem.
A parameter class moves some of the parameter data to its own class, e.g. like this:
var parameterClass1 = new MenuParameter(menuBar, editor);
var parameterClass2 = new StuffParameters(sasquatch, ...);
var ctrl = new MyControllerClass(managedStore, parameterClass1, parameterClass2);
It sort of just moves the problem elsewhere though. You might want to housekeep your constructor instead. Only keep parameters that are important when constructing/initiating the class in question and do the rest with getter/setter methods (or properties if you're doing .NET).
A factory method is a method that creates all the instances you need of a class and has the benefit of encapsulating the creation of those objects. Factory methods are also quite easy to refactor towards from a Singleton, because they're similar to the getInstance methods that you see in Singleton patterns. Say we have the following non-threadsafe simple singleton example:
// The Rather Unfortunate Singleton Class
public class SingletonStore {
    private static SingletonStore _singleton
        = new SingletonStore();
    private SingletonStore() {
        // Do some privatised constructing in here...
    }
    public static SingletonStore getInstance() {
        return _singleton;
    }
    // Some methods and stuff to be down here
}
// Usage:
// var singleInstanceOfStore = SingletonStore.getInstance();
It is easy to refactor this towards a factory method. The solution is to remove the static reference:
public class StoreWithFactory {
    public StoreWithFactory() {
        // If the constructor is private or public doesn't matter
        // unless you do TDD, in which you need to have a public
        // constructor to create the object so you can test it.
    }
    // The method returning an instance of Singleton is now a
    // factory method.
    public static StoreWithFactory getInstance() {
        return new StoreWithFactory();
    }
}
// Usage:
// var myStore = StoreWithFactory.getInstance();
Usage is still the same, but you're not bogged down with having a single instance. Naturally you would move this factory method to its own class, as the Store class shouldn't concern itself with its own creation (and moving the factory method out also happens to follow the Single Responsibility Principle).
From here you have many choices, but I'll leave that as an exercise for yourself. It is easy to over-engineer (or overheat) on patterns here. My tip is to only apply a pattern when there is a need for it.
Okay, first of all, the "singletons are always evil" notion is wrong. You use a Singleton whenever you have a resource which won't or can't ever be duplicated. No problem.
That said, in your example, there's an obvious degree of freedom in the application: someone could come along and say "but I want two stores."
There are several solutions. The one that occurs first of all is to build a factory class; when you ask for a Store, it gives you one named with some universal name (eg, a URI.) Inside that store, you need to be sure that multiple copies don't step on one another, via critical regions or some method of ensuring atomicity of transactions.
Miško Hevery has a nice article series on testability, among other things the singleton, where he isn't only talking about the problems, but also how you might solve it (see 'Fixing the flaw').
I like to encourage the use of singletons where necessary while discouraging the use of the Singleton pattern. Note the difference in the case of the word. The singleton (lower case) is used wherever you only need one instance of something. It is created at the start of your program and is passed to the constructor of the classes that need it.
class Log
{
public:
    void logmessage(...)
    { // do some stuff
    }
};
int main()
{
    Log log;
    // do some more stuff
}
class Database
{
    Log &_log;
public:
    Database(Log &log) : _log(log) {}
    void Open(...)
    {
        _log.logmessage(whatever);
    }
};
Using a singleton gives all of the capabilities of the Singleton anti-pattern, but it makes your code more easily extensible and testable (in the sense of the word defined in the Google testing blog). For example, we may decide that we also need the ability to log to a web service at times; using the singleton, we can easily do that without significant changes to the code.
By comparison, the Singleton pattern is another name for a global variable. It is never used in production code.

Design question: How can I access an IPC mechanism transparently?

I want to do this (no particular language):
print(foo.objects.bookdb.books[12].title);
or this:
book = foo.objects.bookdb.book.new();
book.title = 'RPC for Dummies';
book.save();
Where foo actually is a service connected to my program via some IPC, and to access its methods and objects, some layer actually sends and receives messages over the network.
Now, I'm not really looking for an IPC mechanism, as there are plenty to choose from. It's likely not to be XML based, but rather something like Google's protocol buffers, dbus or CORBA. What I'm unsure about is how to structure the application so I can access the IPC just like I would any object.
In other words, how can I have OOP that maps transparently over process boundaries?
Note that this is a design question and I'm still working at a pretty high level of the overall architecture, so I'm still pretty agnostic about which language this is going to be in. C#, Java and Python are all likely to get used, though.
I think the way to do what you are requesting is to have all object communication regarded as message passing. This is how object methods are handled in ruby and smalltalk, among others.
With message passing (rather than method calling) as your object communication mechanism, operations such as calling a method that didn't exist when you wrote the code become sensible, because the object can do something useful with the message anyway (check for a remote procedure, return a value for a field with the same name from a database, throw a 'method not found' exception, or anything else you could think of).
It's important to note that for languages that don't use this as a default mechanism, you can do message passing anyway (every object has a 'handleMessage' method) but you won't get the syntax niceties, and you won't be able to get IDE help without some extra effort on your part to get the IDE to parse your handleMessage method to check for valid inputs.
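Python is one language where this can be sketched directly, because __getattr__ is invoked for attributes that are not found normally. The following is an assumption-laden illustration only: RemoteProxy, BookService and the message layout are invented, and the "remote" side is an in-process object rather than a real IPC channel.

```python
class RemoteProxy:
    """Turns any method call into a message handed to a send function."""

    def __init__(self, send):
        self._send = send  # hypothetical transport: callable(message) -> reply

    def __getattr__(self, name):
        # Called only for attributes that do not exist locally.
        def call(*args):
            return self._send({"method": name, "args": list(args)})
        return call


class BookService:
    """In-process stand-in for the remote service."""

    def __init__(self):
        self.titles = {12: "RPC for Dummies"}

    def handle_message(self, message):
        handler = getattr(self, "do_" + message["method"], None)
        if handler is None:
            raise AttributeError("method not found: " + message["method"])
        return handler(*message["args"])

    def do_title(self, book_id):
        return self.titles[book_id]


foo = RemoteProxy(BookService().handle_message)
```

A call such as foo.title(12) is packaged as a message and answered by handle_message; an unknown method name ends in the 'method not found' error on the receiving side.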
Read up on Java's RMI -- the introductory material shows how you can have a local definition of a remote object.
The trick is to have two classes with identical method signatures. The local version of the class is a facade over some network protocol. The remote version receives requests over the network and does the actual work of the object.
You can define a pair of classes so a client can have
foo= NonLocalFoo( "http://host:port" )
foo.this= "that"
foo.save()
And the server receives set_this() and save() method requests from a client connection. The server side is (generally) non-trivial because you have a bunch of discovery and instance management issues.
You shouldn't do it! It is very important for programmers to see and feel the difference between an IPC/RPC and a local method call in the code. If you arrange things so that they don't have to think about it, they won't think about it, and that will lead to very poorly performing code.
Think of:
foreach o, o.isGreen in someList {
    o.makeBlue;
}
The programmer assumes that the loop takes a few nanoseconds to complete; instead it takes close to a second if someList happens to be remote.