What are the actual advantages of the visitor pattern? What are the alternatives? - language-agnostic

I read quite a lot about the visitor pattern and its supposed advantages. To me however it seems they are not that much advantages when applied in practice:
"Convenient" and "elegant" seems to mean lots and lots of boilerplate code
Therefore, the code is hard to follow. Also 'accept'/'visit' is not very descriptive
Even uglier boilerplate code if your programming language has no method overloading (i.e. Vala)
You cannot in general add new operations to an existing type hierarchy without modification of all classes, since you need new 'accept'/'visit' methods everywhere as soon as you need an operation with different parameters and/or return value (changes to classes all over the place is one thing this design pattern was supposed to avoid!?)
Adding a new type to the type hierarchy requires changes to all visitors. Also, your visitors cannot simply ignore a type - you need to create an empty visit method (boilerplate again)
It all just seems to be an awful lot of work when all you want to do is actually:
// Pseudocode
int SomeOperation(ISomeAbstractThing obj) {
switch (type of obj) {
case Foo: // do Foo-specific stuff here
case Bar: // do Bar-specific stuff here
case Baz: // do Baz-specific stuff here
default: return 0; // do some sensible default if type unknown or if we don't care
}
}
The only real advantage I see (which btw i haven't seen mentioned anywhere): The visitor pattern is probably the fastest method to implement the above code snippet in terms of cpu time (if you don't have a language with double dispatch or efficient type comparison in the fashion of the pseudocode above).
Questions:
So, what advantages of the visitor pattern have I missed?
What alternative concepts/data structures could be used to make the above fictional code sample run equally fast?

For as far as I have seen so far there are two uses / benefits for the visitor design pattern:
Double dispatch
Separate data structures from the operations on them
Double dispatch
Let's say you have a Vehicle class and a VehicleWasher class. The VehicleWasher has a Wash(Vehicle) method:
VehicleWasher
Wash(Vehicle)
Vehicle
Additionally we also have specific vehicles like a car and in the future we'll also have other specific vehicles. For this we have a Car class but also a specific CarWasher class that has an operation specific to washing cars (pseudo code):
CarWasher : VehicleWasher
Wash(Car)
Car : Vehicle
Then consider the following client code to wash a specific vehicle (notice that x and washer are declared using their base type because the instances might be dynamically created based on user input or external configuration values; in this example they are simply created with a new operator though):
Vehicle x = new Car();
VehicleWasher washer = new CarWasher();
washer.Wash(x);
Many languages use single dispatch to call the appropriate function. Single dispatch means that during runtime only a single value is taken into account when determining which method to call. Therefore only the actual type of washer we'll be considered. The actual type of x isn't taken into account. The last line of code will therefore invoke CarWasher.Wash(Vehicle) and NOT CarWasher.Wash(Car).
If you use a language that does not support multiple dispatch and you do need it (I can honoustly say I have never encountered such a situation though) then you can use the visitor design pattern to enable this. For this two things need to be done. First of all add an Accept method to the Vehicle class (the visitee) that accepts a VehicleWasher as a visitor and then call its operation Wash:
Accept(VehicleWasher washer)
washer.Wash(this);
The second thing is to modify the calling code and replace the washer.Wash(x); line with the following:
x.Accept(washer);
Now for the call to the Accept method the actual type of x is considered (and only that of x since we are assuming to be using a single dispatch language). In the implementation of the Accept method the Wash method is called on the washer object (the visitor). For this the actual type of the washer is considered and this will invoke CarWasher.Wash(Car). By combining two single dispatches a double dispatch is implemented.
Now to eleborate on your remark of the terms like Accept and Visit and Visitor being very unspecific. That is absolutely true. But it is for a reason.
Consider the requirement in this example to implement a new class that is able to repair vehicles: a VehicleRepairer. This class can only be used as a visitor in this example if it would inherit from VehicleWasher and have its repair logic inside a Wash method. But that ofcourse doesn't make any sense and would be confusing. So I totally agree that design patterns tend to have very vague and unspecific naming but it does make them applicable to many situations. The more specific your naming is, the more restrictive it can be.
Your switch statement only considers one type which is actually a manual way of single dispatch. Applying the visitor design pattern in the above way will provide double dispatch.
This way you do not necessarily need additional Visit methods when adding additional types to your hierarchy. Ofcourse it does add some complexity as it makes the code less readable. But ofcourse all patterns come at a price.
Ofcourse this pattern cannot always be used. If you expect lots of complex operations with multiple parameters then this will not be a good option.
An alternative is to use a language that does support multiple dispatch. For instance .NET did not support it until version 4.0 which introduced the dynamic keyword. Then in C# you can do the following:
washer.Wash((dynamic)x);
Because x is then converted to a dynamic type its actual type will be considered for the dispatch and so both x and washer will be used to select the correct method so that CarWasher.Wash(Car) will be called (making the code work correctly and staying intuitive).
Separate data structures and operations
The other benefit and requirement is that it can separate the data structures from the operations. This can be an advantage because it allows new visitors to be added that have there own operations while it also allows data structures to be added that 'inherit' these operations. It can however be only applied if this seperation can be done / makes sense. The classes that perform the operations (the visitors) do not know the structure of the data structures nor do they have to know that which makes code more maintainable and reusable. When applied for this reason the visitors have operations for the different elements in the data structures.
Say you have different data structures and they all consist of elements of class Item. The structures can be lists, stacks, trees, queues etc.
You can then implement visitors that in this case will have the following method:
Visit(Item)
The data structures need to accept visitors and then call the Visit method for each Item.
This way you can implement all kinds of visitors and you can still add new data structures as long as they consist of elements of type Item.
For more specific data structures with additional elements (e.g. a Node) you might consider a specific visitor (NodeVisitor) that inherits from your conventional Visitor and have your new data structures accept that visitor (Accept(NodeVisitor)). The new visitors can be used for the new data structures but also for the old data structures due to inheritence and so you do not need to modify your existing 'interface' (the super class in this case).

In my personal opinion, the visitor pattern is only useful if the interface you want implemented is rather static and doesn't change a lot, while you want to give anyone a chance to implement their own functionality.
Note that you can avoid changing everything every time you add a new method by creating a new interface instead of modifying the old one - then you just have to have some logic handling the case when the visitor doesn't implement all the interfaces.
Basically, the benefit is that it allows you to choose the correct method to call at runtime, rather than at compile time - and the available methods are actually extensible.
For more info, have a look at this article - http://rgomes-info.blogspot.co.uk/2013/01/a-better-implementation-of-visitor.html

By experience, I would say that "Adding a new type to the type hierarchy requires changes to all visitors" is an advantage. Because it definitely forces you to consider the new type added in ALL places where you did some type-specific stuff. It prevents you from forgetting one....

This is an old question but i would like to answer.
The visitor pattern is useful mostly when you have a composite pattern in place in which you build a tree of objects and such tree arrangement is unpredictable.
Type checking may be one thing that a visitor can do, but say you want to build an expression based on a tree that can vary its form according to a user input or something like that, a visitor would be an effective way for you to validate the tree, or build a complex object according to the items found on the tree.
The visitor may also carry an object that does something on each node it may find on that tree. this visitor may be a composite itself chaining lots of operations on each node, or it can carry a mediator object to mediate operations or dispatch events on each node.
You imagination is the limit of all this. you can filter a collection, build an abstract syntax tree out of an complete tree, parse a string, validate a collection of things, etc.

Related

Multiple logics (N number of clients) handled with only one function. All call the same function, HOW TO? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Improve this question
I have a corporate application written in python/Django (no python experience required to answer this). Its SAAS basically.
Few of the clients seems to have different requirement for a few modules.
Lets say there is a URL
www.xyz.com/groups
which is used by all clients but a few of the clients want to have different output on call of the same URL.
I want to know how can i do that without writing new function for each client or writing conditions in a single function.
Its a silly question i know but there must be some solution to it, i know.
If your code is required to do "A" for case "a" and "B" for case "b" and "C" for case "c", then regardless of what solution you pick, somewhere in the code has to exists something that decides whever or not case 'a/b/c' occurs, and something must exist that will dispatch correct 'A/B/C' action for that case, and of course those A/B/C actions have to be written somewhere in the code, too.
Step outside of the code and think about it. If it is specified and must happen - it has to be coded somewhere. You cannot escape that. Now, if the cases/actions are trivial and typical, you might find some more-or-even-more configurable library that accidentally allows you to configure such cases and actions, and off you go, you have it "with no code" and no "clutter". But looking formally, the code is deep there in the library. So, the decider, dispatcher and actions are coded. Just not by you - by someone that guessed your needs.
But if your needs are nontrivial and highly specific, for example, if it require your various conditions to decide which a/b/c case is it - then most probably you will have to code the 'decider' part for yourself. That means lots of tree-of-IFs, nested-switches, rules-n-loops, or whatever you like or feel adequate. After this, you are left with dispatch/execute phase, and this can be realized in a multitude of ways - i.e. strategy pattern - it is exactly it: dispatch (by concrete class related to case) and execute (the concrete strategy has the concrete code for the case).
Let's try something-like-OO approach:
For example, if you have cases a/b/c for UserTypes U1,U2,U3, you could introduce three classes:
UserType1 inherits from abstract UserType or implements "DoAThing" interface
UserType2 inherits from abstract UserType or implements "DoAThing" interface
UserType3 inherits from abstract UserType or implements "DoAThing" interface
UserType1 implements virtual method 'doTheThing' that executes actionA
UserType2 implements virtual method 'doTheThing' that executes actionB
UserType3 implements virtual method 'doTheThing' that executes actionC
your Users stop keeping "UserType" of type "int" equal to '1/2/3' - now their type is an object: UserType1, UserType2 or UserType3
whenever you must do the thing for a given user, you now just:
result = user.getType().doTheThing( ..params..)
So, instead of iffing/switching, you use OO: tell, don't ask rule. If the action-to-do is dependent solely on UserType, then let the UserType perform it. The resulting code is as short as possible - but at the cost of number of classes to create and, well, ...
... the decider, dispatcher and actions are still in the code. Actions - obvious - in the various usertype clasess. Dispatch - obvious - virtual call by common abstract base method. And decider..? Well: someone at some point had to choose and construct the correct UserType object for the user. If user was stored in the database, if "usertype" is just an integer 1/2/3, then somewhere in your ORM layer those 1/2/3 numbers had to be decoded and translated into UserType1/2/3 classes/objects. That means, that you'd need there a tree-of-ifs or a switch or etc. Or, if you have an ORM smart enough - you just set up a bunch of rules and it did it for you, but that's just again delegating part of the job to more-or-even-more configurable library. Not mentioning that your UserType1/2/3 classes in fact became somewhat .. strategies.
Ok, let's attack the 'choose' part.
You can build a tree of ifs or switches somewhere to decide and assign, but imperative seems to smell. Or, with OO, you can try to polymorphize something so that "it will just do the right thing", but it will not solve anything since again you will have to choose the object type somewhere. So, let's try data-driven: let's use lookups.
we've got five implementations of an action
create a hash/dictionary/map
add usertype1->caseA to the map
add usertype2->caseC to the map
add usertype3->caseB to the map
add usertype4->caseA to the map
add usertype5->caseE to the map
....
now, whenever you have a user and need to decide, just look it up. Instead of a "case" you may hold a ready to use object of a strategy. Or a callable method. Or a typename. Or whatever you need. The point is that instead of writing
if( user.type == 1) { ... }
else if( user.type == 2) ...
or switching, you just look it up:
thing = map[ user.type ]
if ( thing is null ) ???
but, mind that without some care, you might sometimes NOT find a match in the map. And also, the map must be PREDEFINED for ALL CASES. So, simple if X < 100 may turn up into a hundred of entries 0..99 inside the map.
Of course, instead of a map, you may use some rule-engine and you could define a mapping like
X<100 -> caseA
X>=100 -> caseB
and then 'run' the rules against your usertype and obtain a 'result' that will tell you "caseA".
And so on.
Each of the parts - decide, dispatch, execute - you may implement in various ways, shorter or longer, more or less extensible, more or less configurable, as OO/imperative/datadriven/functional/etc - but you cannot escape them:
you have to define the discriminant of the cases
you have to define the implementation of the actions
you have to define the mapping case-to-action
How to do them, is a matter of your aesthetics, language features, frameworks, libraries and .. time you want to spend on creating and mantaining it.

Is having a single massive class for all data storage OK?

I have created a class that I've been using as the storage for all listings in my applications. The class allows me to "sign" an object to a listing (which can be created on the fly via the sign() method like so):
manager.sign(myObject, "someList");
This stores the index of the element (using it's unique id) in the newly created or previously created listing "someList" as well as the object in a 2D array. So for example, I might end up with this:
trace(_indexes["someList"][objectId]); // 0 - the object is the first in this list
trace(_instances["someList"]); // [object MyObject]
The class has another two methods:
find(signature:String):Array
This method returns an array via slice() containing all of the elements signed with the given signature.
findFirst(signature:String):Object
This method just returns the first object in a given listing
So to retrieve myObject I can either go:
trace(find("someList")[0]); or trace(findFirst("someList"));
Finally, there is an unsign() function which will remove an object from a given listing. This function basically:
Stores the result of pop() in the specified listing against a variable.
Uses the stored index to quickly replace the specified object with the pop()'d item.
Deletes the stored index for the specified object and updates the index for the pop()'d item.
Through all this, using unsign() will remove an object extremely quickly from a listing of any size.
Now this is all well and good, but I've had some thoughts which are making me consider how good this really is? I mean being able to easily list, remove and access lists of anything I want throughout the application like this is awesome - but is there a catch?
A couple of starting thoughts I have had are:
So far I haven't implemented support for listings that are private and only accessible via a given class.
Memory - this doesn't seem very memory efficient. Then again, neither is creating arrays for everything I want to store individually either. Just seems.. Larger.. Somehow.
Any insights?
I've uploaded the class here in case the above doesn't make much sense: https://projectavian.com/AviManager.as
Your solution seems pretty solid. If you're looking to modify it to be a bit more extensible and handle rights management, you might consider moving all those individually indexed properties to a value object for your AV elements. You could perform operations like "sign" and "unsign" internally in the VOs, or check for access rights. Your management class could monitor the collection of these VOs, pass them around, perform the method calls, and the objects would hold the state in a bit more readable format.
Really, though, this is entering into a coding style discussion. Your method works and it's not particularly inefficient. Just make sure the code is readable, encapsulated, and extensible and you're good.

Naming conventions for methods which must be called in a specific order?

I have a class that requires some of its methods to be called in a specific order. If these methods are called out of order then the object will stop working correctly. There are a few asserts in the methods to ensure that the object is in a valid state. What naming conventions could I use to communicate to the next person to read the code that these methods need to be called in a specific order?
It would be possible to turn this into one huge method, but huge methods are a great way to create problems. (There are a 2 methods than can trigger this sequence so 1 huge method would also result in duplication.)
It would be possible to write comments that explain that the methods need to be called in order but comments are less useful then clearly named methods.
Any suggestions?
Is it possible to refactor so (at least some of) the state from the first function is passed as a paramter to the second function, then it's impossible to avoid?
Otherwise, if you have comments and asserts, you're doing quite well.
However, "It would be possible to turn this into one huge method" makes it sound like the outside code doesn't need to access the intermediate state in any way. If so, why not just make one public method, which calls several private methods successively? Something like:
FroblicateWeazel() {
// Need to be in this order:
FroblicateWeazel_Init();
FroblicateWeazel_PerformCals();
FroblicateWeazel_OutputCalcs();
FroblicateWeazel_Cleanup();
}
That's not perfect, but if the order is centralised to that one function, it's fairly easy to see what order they should come in.
Message digest and encryption/decryption routines often have an _init() method to set things up, an _update() to add new data, and a _final() to return final results and tear things back down again.

OOP Design Question - Linking to precursors

I'm writing a program to do a search and export the output.
I have three primary objects:
Request
SearchResults
ExportOutput
Each of these objects links to its precursor.
Ie: ExportOutput -> SearchResults -> Request
Is this ok? Should they somehow be more loosely coupled?
Clarification:
Processes later on do use properties and methods on the precursor objects.
Ie:
SendEmail(output.SearchResults.Request.UserEmail, BODY, SUBJECT);
This has a smell even to me. The only way I can think to fix it is have hiding properties in each one, that way I'm only accessing one level
MailAddress UserEmail
{
get { return SearchResults.UserEmail; }
}
which would yeild
SendEmail(output.UserEmail, BODY, SUBJECT);
But again, that's just hiding the problem.
I could copy everything out of the precursor objects into their successors, but that would make ExportOutput really ugly. Is their a better way to factor these objects.
Note: SearchResults implements IDisposable because it links to unmanaged resources (temp files), so I really don't want to just duplicate that in ExportOutput.
If A uses B directly, you cannot:
Reuse A without also reusing B
Test A in isolation from B
Change B without risking breaking A
If instead you designed/programmed to interfaces, you could:
Reuse A without also reusing B - you just need to provide something that implements the same interface as B
Test A in isolation from B - you just need to substitute a Mock Object.
Change B without risking breaking A - because A depends on an interface - not on B
So, at a minimum, I recommend extracting interfaces. Also, this might be a good read for you: the Dependency Inversion Principle (PDF file).
Without knowing your specifics, I would think that results in whatever form would simply be returned from a Request's method (might be more than one such method from a configured Request, like find_first_instance vs. find_all_instances). Then, an Exporter's output method(s) would take results as input. So, I am not envisioning the need to link the objects at all.

Transforming an object implicitly

The following code illustrates a pattern I sometimes see, whereby an object is transformed implicitly as it is passed as a parameter across a number of method calls.
var o = new MyReferenceType();
DoSomeWorkAndPossiblyModifyO(o);
DoYetMoreWorkAndPossiblyFurtherModifyO(o);
//now use o...
This feels wrong to me (it hardly feels object oriented). Is it acceptable?
Based on your method names, I would argue that there is nothing implicit in the transformation. This pattern would be acceptable. If, on the other hand your methods had names like printO(o) or compareTo(o), but actually modified the Object o, the design would be bad.
It is acceptable but usually bad style.
The usual "good" approach is:
DoSomeWorkAndModify(&o); // explicit reference means we will be accepting changes
o = DoSomeWorkAndReturnModified(o); // much more elastic because you often want to keep original.
The approach you presented makes sense when o is huge, and making a copy of it in memory is out of question, or if it's a function you (and nobody else = private) use very frequently and don't want to bother with the & syntax. Otherwise it's laziness that results in some really difficult to detect bugs.
It depends entirely on what the methods actually do, besides modifying that object.
For instance, an object primarily related to keeping some state in memory might for instance not have anything related to persisting that state anywhere.
The methods could for instance load data from a database, and update the object with that information.
However! Since I program mostly in C# and thus .NET, which is a wholly object-oriented language, I would actually write your code like this:
var o = new MyReferenceType();
SomeOtherClass.DoSomeWorkAndPossiblyModifyO(o);
SomeOtherClass.DoYetMoreWorkAndPossiblyFurtherModifyO(o);
//now use o...
In which case the actual name of that other class (or those other classes if there's 2 involved) would give me a big clue as to what is actually happening and/or the context.
Example:
Person p = new Person();
DatabaseContext.FetchAllLazilyLoadedProperties(p);
DatabaseContext.Save(p); // updates primary key property with new ID