Communication between visitor and visitee


My current project contains a complex object hierarchy. The following structure is a simplified example of this hierarchy for demonstration purposes:
Library
    Category "Fiction"
        Category "Science Fiction"
            Book A (Each book contains pages, not displayed here)
            Book B
        Category "Crime"
            Book C
    Category "Non-fiction"
        (Many subcategories)
Now, I want to avoid having nested loops all over my code whenever I need some information from the data structure, because when the structure changes I'd have to update all the loops.
So I plan on using the visitor pattern, which seems to give me the flexibility I need. It would look something like this:
class Library
{
    void Accept(ILibraryVisitor visitor)
    {
        IterateCategories(this.categories, visitor);
    }
    void IterateCategories(
        IEnumerable<Category> categorySequence,
        ILibraryVisitor visitor)
    {
        foreach (var category in categorySequence)
        {
            visitor.VisitCategory(category.Name);
            IterateCategories(category.Subcategories, visitor);
            foreach (var book in category.Books)
            {
                // Could also pass in a book instance, not sure about that yet...
                visitor.VisitBook(book.Title, book.Author, book.PublishingDate);
                foreach (var page in book.Pages)
                {
                    visitor.VisitPage(page.Number, page.Content);
                }
            }
        }
    }
}
interface ILibraryVisitor
{
    void VisitCategory(string name);
    void VisitBook(string title, string author, DateTime publishingDate);
    void VisitPage(int pageNumber, string content);
}
I'm already seeing some possible problems though, so I'm hoping you can give me some advice.
Question 1
If I wanted to create a list of book titles prefixed by the (sub)categories it belongs to (e.g. Fiction » Science Fiction » Book A), a simple visitor implementation would appear to do the trick:
// LibraryVisitor is a base implementation with no-op methods
class BookListingVisitor : LibraryVisitor
{
private Stack<string> categoryStack = new Stack<string>();
void VisitCategory(string name)
{
this.categoryStack.Push(name);
}
// Other methods
}
Here I have already run into a problem: I have no clue on when to pop the stack, because I don't know when a category ends. Is it a common approach to split up the VisitCategory method into two methods, like below?
interface ILibraryVisitor
{
void VisitCategoryStart(string name);
void VisitCategoryEnd();
// Other methods
}
Or are there other ways of dealing with structures like this, which have a clear scope with a start and end?
Question 2
Suppose I only want to list the books that were published in 1982. A decorator visitor would separate the filtering from the listing logic:
class BooksPublishedIn1982 : LibraryVisitor
{
private ILibraryVisitor visitor;
public BooksPublishedIn1982(ILibraryVisitor visitor)
{
this.visitor = visitor;
}
void VisitBook(string title, string author, DateTime publishingDate)
{
if (publishingDate.Year == 1982)
{
this.visitor.VisitBook(title, author, publishingDate);
}
}
// Other methods that simply delegate to this.visitor
}
The problem here is that VisitPage will still be called for books that are not published in 1982. So the decorator somehow needs to communicate with the visited object:
Visitor: 'Hey, this book isn't from 1982, so please don't tell me anything about it.'
Library: 'Oh ok, then I won't show you its pages.'
The visit methods currently return void. I could change it to return a boolean which indicates whether to visit sub-items, but that feels kind of dirty. Are there common practices for letting the visitee know that it should skip certain items? Or perhaps I should look into a different design pattern?
P.S. If you think these should be two separate questions, just let me know and I'll be happy to split them up.

The Visitor pattern, as described by the GoF book, deals with class hierarchies and not with object hierarchies. To put it simply, adding a new Visitor type acts like adding a new virtual function to the base class and all the children, without touching their code.
The machinery of a Visitor consists of one Visitor::Visit function per class in the hierarchy, and the Accept function in the parent class and in all the descendants. It works by calling Accept(visitor) through a parent class reference. The implementation of Accept in the object that happens to be referenced calls the right kind of Visitor::Visit(this). It is all fully orthogonal to any object hierarchy that may exist between instances of different subclasses of our root class.
In your case, the ILibraryVisitor interface would have a VisitLibrary(Library) method, a VisitCategory(Category) method, a VisitBook(Book) method, and so on, while each of Library, Category, Book and so on would inherit a common base class and reimplement its Accept(ILibraryVisitor) method.
So far so good. But from this point on your implementation seems to get a bit disoriented. A Visitor does not call its own Visit functions! Members of the hierarchy do; the Visitor implements these functions for their benefit. So how do we go down the category tree?
Remember that calling Accept(FooVisitor) takes the place of calling a method Foo declared in the root of the hierarchy, and FooVisitor::VisitBar takes the place of the implementation Bar::Foo. When we want to do something with an object, we call its methods, don't we? So let's do it (in pseudocode).
class LibraryVisitor : ILibraryVisitor
{
IterateChildren (List<ILibraryObject> objects) {
foreach obj in objects {
obj.Accept(this);
}
}
IterateSubcategories (Category cat) {
stack.push (cat); # we need a stack here to build a path
IterateChildren (cat.children); # both books and subcategories
stack.pop();
}
VisitLibrary (Library) = abstract
VisitCategory (Category) = abstract
VisitBook (Book) = abstract
VisitPage (Page) = abstract
}
class MyLibraryVisitor : LibraryVisitor {
    VisitLibrary (Library l) { ... IterateChildren (l.categories) ... }
    VisitCategory (Category c) { ... IterateSubcategories (c) ... }
    VisitBook (Book b) { ... IterateChildren (b.pages) ... }
    VisitPage (Page p) { ... no children here, end of walk ... }
}
Note the ping-pong action between Visit and Accept. Visitor calls Accept on the children of the current visitee, the children call Visitor::Visit back, and Visitor calls Accept on their children etc.
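The ping-pong between Visit and Accept can be sketched in Java roughly as follows. This is only a minimal sketch, not the asker's code: the names LibraryNode, LibraryVisitor, Category, Book and BookListingVisitor are assumptions, and pages are left out. It shows that when the visitor drives the walk, it knows exactly when a category starts and ends, so the path stack from Question 1 can be pushed and popped in one place.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Every node type accepts a visitor and calls back the matching visit method.
interface LibraryNode {
    void accept(LibraryVisitor visitor);
}

interface LibraryVisitor {
    void visitCategory(Category category);
    void visitBook(Book book);
}

class Category implements LibraryNode {
    final String name;
    final List<LibraryNode> children; // subcategories and books

    Category(String name, LibraryNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }

    @Override
    public void accept(LibraryVisitor visitor) {
        visitor.visitCategory(this); // "pong": call back into the visitor
    }
}

class Book implements LibraryNode {
    final String title;

    Book(String title) {
        this.title = title;
    }

    @Override
    public void accept(LibraryVisitor visitor) {
        visitor.visitBook(this);
    }
}

// The visitor, not the visitee, drives the walk and owns the path stack.
class BookListingVisitor implements LibraryVisitor {
    // Right-pointing double angle quotation mark, as used in the question.
    private static final String SEPARATOR = " \u00BB ";

    private final Deque<String> path = new ArrayDeque<>();
    final List<String> lines = new ArrayList<>();

    @Override
    public void visitCategory(Category category) {
        path.addLast(category.name);          // the category starts here
        for (LibraryNode child : category.children) {
            child.accept(this);               // "ping": descend into the child
        }
        path.removeLast();                    // the category ends here
    }

    @Override
    public void visitBook(Book book) {
        lines.add(String.join(SEPARATOR, path) + SEPARATOR + book.title);
    }
}
```

Calling accept on the root category with a BookListingVisitor fills lines with one entry per book, each prefixed by its category path.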
This is how your second question is answered:
class BooksPublishedIn1982 : LibraryVisitor
{
VisitBook (Book b) {
if b.publishedIn (1982) {
IterateChildren(b.pages)
}
}
}
Once again, it is apparent that the tree walk and the visitor machinery have just about nothing to do with each other.
I have left the decision of iterating or not iterating children entirely with each Visit implementation. This need not be the case; you can easily split each VisitXYZ into two functions, VisitXYZProper and VisitXYZChildren. By default, VisitXYZ will call both, and each concrete visitor may override that decision.
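One way to picture that split is the Java sketch below. It is an illustration under assumptions, not a definitive design: SplitVisitor, DatedBook, Page, PageLogger and PublishedIn1982Logger are made-up names, and categories are omitted to keep it short. The base class calls the "proper" step and then the "children" step; the 1982 filter overrides the combined method and simply does not descend into other books, so no pages are reported for them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Page {
    final int number;

    Page(int number) {
        this.number = number;
    }

    void accept(SplitVisitor visitor) {
        visitor.visitPage(this);
    }
}

class DatedBook {
    final String title;
    final int year;
    final List<Page> pages;

    DatedBook(String title, int year, Page... pages) {
        this.title = title;
        this.year = year;
        this.pages = Arrays.asList(pages);
    }

    void accept(SplitVisitor visitor) {
        visitor.visitBook(this);
    }
}

// Default behaviour: handle the book itself, then walk its children.
abstract class SplitVisitor {
    void visitBook(DatedBook book) {
        visitBookProper(book);
        visitBookChildren(book);
    }

    void visitBookProper(DatedBook book) { }

    void visitBookChildren(DatedBook book) {
        for (Page page : book.pages) {
            page.accept(this);
        }
    }

    void visitPage(Page page) { }
}

// Records what it sees; relies on the default walk.
class PageLogger extends SplitVisitor {
    final List<String> log = new ArrayList<>();

    @Override
    void visitBookProper(DatedBook book) {
        log.add("book " + book.title);
    }

    @Override
    void visitPage(Page page) {
        log.add("page " + page.number);
    }
}

// Overrides the combined step: books from other years are skipped entirely,
// so their pages are never visited.
class PublishedIn1982Logger extends PageLogger {
    @Override
    void visitBook(DatedBook book) {
        if (book.year == 1982) {
            super.visitBook(book);
        }
    }
}
```

The visitee never needs a return value telling it what to skip, because the visitor is the one deciding whether to call accept on the children.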

Related

Is "one method per class" overdoing the Single Responsibility Principle?

I'm building a simple todo list application on Android because I want to familiarize myself with the clean architecture. I layered the application into domain, data and presentation layers, and here is the example I'm following: https://github.com/android10/Android-CleanArchitecture
When I tried to figure out what the domain of this application is, I asked myself "what is this application about?". To which I replied, "It is about letting the user create a group and create tasks within that group". Very simple.
So I created the following:
Adding tasks to a group in the Room database:
public class AddGroupItemUseCase extends AbstractUseCaseCompletable<AddGroupItemUseCase.Params> {
private final GroupItemRepository mGroupItemRepository;
public AddGroupItemUseCase(GroupItemRepository repository,
PostExecutionThread postExecutionThread,
ThreadExecution threadExecution) {
super(threadExecution, postExecutionThread);
mGroupItemRepository = repository;
}
@Override
public Completable buildUseCaseCompletable(Params params) {
return mGroupItemRepository.addItemToGroup(params.mItem);
}
public static final class Params {
private final Item mItem;
private Params(Item item) {
mItem = item;
}
public static Params itemToBeAdded(Item item) {
return new Params(item);
}
}
}
Adding groups to the Room database:
public class AddGroupUseCase extends AbstractUseCaseCompletable<AddGroupUseCase.Params> {
private final GroupRepository mGroupRepository;
public AddGroupUseCase(ThreadExecution threadExecution,
PostExecutionThread postExecutionThread,
GroupRepository repository) {
super(threadExecution, postExecutionThread);
mGroupRepository = repository;
}
@Override
public Completable buildUseCaseCompletable(Params params) {
return mGroupRepository.addGroup(params.mGroup);
}
public static final class Params {
private final Group mGroup;
private Params(Group group) {
mGroup = group;
}
public static AddGroupUseCase.Params groupToAdd(Group group) {
return new AddGroupUseCase.Params(group);
}
}
}
So an obvious question arises: do I have to create one of these one-class-one-method classes for every CRUD operation? For example, what if I want to know how many tasks are in a group? Do I have to create a class with that method in order to comply with the clean architecture? It feels like a lot of classes need to be created. I guess it makes sense because of SRP, but then you would have a lot of "functional classes" to keep up with.
Any thoughts? Thank you!

Yes! You should not have "one class one method".
Responsibility in SRP doesn't mean doing just one single task; it means holding all the responsibility for a single domain. That is, a class does everything within a single concern, which does not overlap with another class's concern. You can have one class to do everything with "groups" and one class to do everything with "tasks". This is how things are normally organized.
From Wikipedia:
The single-responsibility principle says that these two aspects of the problem are really two separate responsibilities, and should therefore be in separate classes or modules. ... The reason it is important to keep a class focused on a single concern is that it makes the class more robust.
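As a rough illustration of "one class to do everything with groups", here is a minimal in-memory Java sketch. It deliberately leaves out the question's use-case base classes, RxJava and Room; GroupService and its method names are assumptions made up for this example. Counting the tasks in a group becomes one more method on the same class rather than a new class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class: a single home for every group-related operation.
class GroupService {
    private final Map<String, List<String>> tasksByGroup = new LinkedHashMap<>();

    void addGroup(String groupName) {
        tasksByGroup.putIfAbsent(groupName, new ArrayList<>());
    }

    void addTask(String groupName, String task) {
        tasksOf(groupName).add(task);
    }

    int countTasks(String groupName) {
        return tasksOf(groupName).size();
    }

    // Shared lookup so every operation rejects unknown groups the same way.
    private List<String> tasksOf(String groupName) {
        List<String> tasks = tasksByGroup.get(groupName);
        if (tasks == null) {
            throw new IllegalArgumentException("Unknown group: " + groupName);
        }
        return tasks;
    }
}
```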

Integrity of Law of Demeter preserved by using helper function (removed two dots)?

public class House
{
WeatherStation station;
public float getTemp() {
//Law of Demeter has been violated here
return station.getThermometer().getTemperature();
}
}
public class House
{
WeatherStation station;
public float getTemp() {
//Law of Demeter has been preserved?
Thermometer thermometer = station.getThermometer();
return getTempHelper(thermometer);
}
public float getTempHelper(Thermometer thermometer)
{
return thermometer.getTemperature();
}
}
In the code above you can see two different House class definitions. Both have a getTemp() function; the first violates the Law of Demeter, but the second preserves it (according to the Head First Design Patterns book).
The trouble is that I don't quite get why the second class preserves the Law of Demeter: its getTemp() function still contains the station.getThermometer() call, which (should?) violate the Law of Demeter.
"use only one dot" - I found this on Wikipedia, which could be applicable, but I still need a more detailed explanation - "In particular, an object should avoid invoking methods of a member object returned by another method" (wiki).
So could anyone explain why the second code example does not violate the law? What truly distinguishes the second method from the first one?

I imagine there's a lot of discussion that can be had on the subject, but as I interpret it the purpose of the Law Of Demeter would be...
"You don't want to get the Thermometer from the Station. You want to get the Temperature from the Station."
Think of it from a real-life situation. You call up the weather station, you don't ask them, "What does the thermometer on the outside of your building say?" You ask them, "What is the current temperature?" The fact that they have a thermometer attached to the outside of their building isn't your concern. Maybe they replace the thermometer with an infrared laser pointed at a window. It doesn't matter to you. How they come by their information isn't your concern, you just want the information.
So, to that end, you'd end up with something like this:
public class House
{
private WeatherStation _station;
public House(WeatherStation station)
{
_station = station;
}
public float GetTemperature()
{
return _station.GetTemperature();
}
}
public class WeatherStation
{
private Thermometer _thermometer;
public WeatherStation(Thermometer thermometer)
{
_thermometer = thermometer;
}
public float GetTemperature()
{
return _thermometer.GetTemperature();
// This can be replaced with another implementation, or any other
// device which implements IThermometer, or a hard-coded test value, etc.
}
}
This leads to a few levels of nesting, which does appear to be a little distasteful. But keep in mind that each level, while currently called the exact same thing, means something slightly different. It's not really code duplication if the duplicated code has a different meaning. You could later break the chain with something like this:
public class House
{
private WeatherStation _station;
public House(WeatherStation station)
{
_station = station;
}
public WeatherInfoDTO GetCurrentWeather()
{
var info = new WeatherInfoDTO();
info.Temperature = _station.GetTemperature();
//...
return info;
}
}
public class WeatherInfoDTO
{
//...
}
public class WeatherStation
{
private Thermometer _thermometer;
public WeatherStation(Thermometer thermometer)
{
_thermometer = thermometer;
}
public float GetTemperature()
{
return _thermometer.GetTemperature();
// This can be replaced with another implementation, or any other
// device which implements IThermometer, or a hard-coded test value, etc.
}
//...
}
By not hard-coding the top-level to the implementation of a Thermometer you allow for easy refactoring to support something like this.
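To make the point about swapping the thermometer concrete, here is a small Java sketch. It assumes a hypothetical TemperatureSensor interface (playing the role of the IThermometer mentioned in the comments above); the class shapes are otherwise a simplified guess, not the code from the book. House only ever talks to its station, so the station's sensor can be replaced by another device or a fixed test value without touching House.

```java
// Hypothetical abstraction over "whatever measures the temperature".
interface TemperatureSensor {
    float getTemperature();
}

class WeatherStation {
    private final TemperatureSensor sensor;

    WeatherStation(TemperatureSensor sensor) {
        this.sensor = sensor;
    }

    // Callers ask the station for the temperature, never for the sensor.
    float getTemperature() {
        return sensor.getTemperature();
    }
}

class House {
    private final WeatherStation station;

    House(WeatherStation station) {
        this.station = station;
    }

    float getTemperature() {
        return station.getTemperature();
    }
}
```

Because TemperatureSensor has a single method, a test can pass a lambda such as `() -> 21.5f` as the sensor.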

It's only by the most strict definition of the law that the 2nd isn't in violation. In my opinion, its "legality is dubious" :), because you haven't properly abstracted away the caller's knowledge that the station uses a thermometer to obtain the temperature. Instead of the helper, I'd prefer to add a getTemperature() method to the station, encapsulating its use of a thermometer there.
In other words, both examples are aware of the station's implementation details, so removing the station's getThermometer() method will break both examples. To say the second is better kinda violates the spirit of the law, in my opinion.

Abstract syntax tree construction and traversal

I am unclear on the structure of abstract syntax trees. To go "down (forward)" in the source of the program that the AST represents, do you go right on the very top node, or do you go down? For instance, would the example program
a = 1
b = 2
c = 3
d = 4
e = 5
Result in an AST that looks like this:
or this:
Where in the first one, going "right" on the main node will advance you through the program, but in the second one simply following the next pointer on each node will do the same.
It seems like the second one would be more correct since you don't need something like a special node type with a potentially extremely long array of pointers for the very first node. Although, I can see the second one becoming more complicated than the first when you get into for loops and if branches and more complicated things.

The first representation is the more typical one, though the second is compatible with the construction of a tree as a recursive data structure, as may be used when the implementation platform is functional rather than imperative.
Consider:
This is your first example, except shortened and with the "main" node (a conceptual straw man) more appropriately named "block," to reflect the common construct of a "block" containing a sequence of statements in an imperative programming language. Different kinds of nodes have different kinds of children, and sometimes those children include collections of subsidiary nodes whose order is important, as is the case with "block." The same might arise from, say, an array initialization:
int[] arr = {1, 2}
Consider how this might be represented in a syntax tree:
Here, the array-literal-type node also has multiple children of the same type whose order is important.

Where in the first one, going "right" on the main node will advance you through the program, but in the second one simply following the next pointer on each node will do the same.
It seems like the second one would be more correct since you don't need something like a special node type with a potentially extremely long array of pointers for the very first node
I'd nearly always prefer the first approach, and I think you'll find it much easier to construct your AST when you don't need to maintain a pointer to the next node.
I think it's generally easier to have all objects descend from a common base class, similar to this:
abstract class Expr { }
class Block : Expr
{
Expr[] Statements { get; set; }
public Block(Expr[] statements) { ... }
}
class Assign : Expr
{
Var Variable { get; set; }
Expr Expression { get; set; }
public Assign(Var variable, Expr expression) { ... }
}
class Var : Expr
{
string Name { get; set; }
public Var(string name) { ... }
}
class Int : Expr
{
int Value { get; set; }
public Int(int value) { ... }
}
Resulting AST is as follows:
Expr program =
new Block(new Expr[]
{
new Assign(new Var("a"), new Int(1)),
new Assign(new Var("b"), new Int(2)),
new Assign(new Var("c"), new Int(3)),
new Assign(new Var("d"), new Int(4)),
new Assign(new Var("e"), new Int(5)),
});

It depends on the language. In C, you'd have to use the first form to capture the notion of a block, since a block has a variable scope:
{
{
int a = 1;
}
// a doesn't exist here
}
The variable scope would be an attribute of what you call the "main node".

I believe your first version makes more sense, for a couple of reasons.
Firstly, the first more clearly demonstrates the "nestedness" of the program, and is also clearly implemented as a rooted tree (which is the usual concept of a tree).
The second, and more important, reason is that your "main node" could really have been a "branch node" (for example), which can simply be another node within a larger AST. This way, your AST can be viewed in a recursive sense, where each AST is a node with other ASTs as its children. This makes the design of the first much simpler, more general, and very homogeneous.
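The recursive view can be sketched in Java as below. This is an illustrative toy, not a real compiler design: Node, Scope, Assign, Print and Block are assumed names, and values are limited to integers. A Block is just another node, so blocks nest inside blocks, and each block carries its own variable scope, which also shows why a name defined in an inner block is not visible afterwards.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

abstract class Node {
    // Runs this node in the given scope and appends what happened to the trace.
    abstract void run(Scope scope, List<String> trace);
}

// A chain of scopes: lookups fall back to the enclosing block.
class Scope {
    private final Scope parent;
    private final Map<String, Integer> variables = new HashMap<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    void define(String name, int value) {
        variables.put(name, value);
    }

    Integer lookup(String name) {
        Integer value = variables.get(name);
        if (value != null) {
            return value;
        }
        return parent == null ? null : parent.lookup(name);
    }
}

class Assign extends Node {
    final String name;
    final int value;

    Assign(String name, int value) {
        this.name = name;
        this.value = value;
    }

    @Override
    void run(Scope scope, List<String> trace) {
        scope.define(name, value);
        trace.add(name + " = " + value);
    }
}

class Print extends Node {
    final String name;

    Print(String name) {
        this.name = name;
    }

    @Override
    void run(Scope scope, List<String> trace) {
        trace.add(name + " -> " + scope.lookup(name)); // null when not in scope
    }
}

// An ordered list of child nodes; children may themselves be blocks.
class Block extends Node {
    final List<Node> statements;

    Block(Node... statements) {
        this.statements = Arrays.asList(statements);
    }

    @Override
    void run(Scope scope, List<String> trace) {
        Scope inner = new Scope(scope); // each block opens a fresh scope
        for (Node statement : statements) {
            statement.run(inner, trace);
        }
    }
}
```

Running a root Block with a null parent scope walks the whole program in source order, with no next pointers on the statements.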

Suggestion: When dealing with tree data structures, whether a compiler-related AST or some other kind, always use a single "root" node; it may help you perform operations and gives you more control:
class ASTTreeNode {
    bool isRoot() { ... }
    string display() { ... }
    // ...
}
void main ()
{
    ASTTreeNode MyRoot = new ASTTreeNode();
    // ...
    // displays the root node, plus each subnode recursively
    MyRoot.display();
}
Cheers.

Saving and Retrieving Entities of different types using LINQtoSQL

Disclaimer: Bit of a C# newbie - first Software Dev gig in awhile after being in QA for a couple years.
I realize flavors of this question have been asked before (inheritance in LINQtoSQL and the like), but I'm hoping I ask the question differently.
In my database, I will have a super-type of "Event" and multiple sub-types: Conference, Meeting and Appointment, for example.
Event
Id (PK)
TypeId (FK EventTypes.Id)
Title
Conference
Id (PK, FK Event.Id)
Cost
Topic
Meeting
Id (PK, FK Event.Id)
Location
Organizer
Appointment
Id (PK, FK Event.Id)
Time
Address
I am using Rob Conery's MVC StoreFront application as a reference. He essentially gets data from the database and creates class objects, manually mapping Event.Id to db.Event.Id, etc.
I'd like to do this with my Events data model - I'd like to retrieve all Events, and have a LINQ expression dynamic enough to create various event types based on some criteria (TypeId, for example).
var result = from e in db.Events
select new IEvent
{
// Little help? ;)
};
It would be great to find a way to make it so each Event Type knows how to save itself and retrieve itself - I fear having to write the same code for each type, only varying the fields. Make sense?
I did see a question posed and someone answered with something like:
public bool Save<T>() {}
The problem is, I'm not sure where to put this code. I'm also not sure if I should use an IEvent interface or an Event partial class.
I will now end this monster question with an advanced Thank You to those that can offer help/suggestions.
--
EDIT: Good progress - going from DB to Views all with IEvent :) (This Question Helped A Lot)
public class SqlEventRepository : IEventRepository
{
    public IQueryable<IEvent> getEvents() {
        // Get Events without doing a query for each type..possible?
        var eventType1 = {query};
        var eventType2 = {query};
        return {concat of all queries};
    }
    public bool SaveEvent(IEvent evt) {
        // Can I avoid calling a save function for each type of event?
    }
}

You could have a helper class to put your Save<T>() method in. Something like SaveEvents class.
When you want to save using LINQ I'm not so sure that you can use generics as you don't know what T is and therefore cannot update properties in your queries.
I'd use inheritance and then where you'd pass a sub-class, use the parent class (Event) as your argument. Then you can quite easily cast to your subclasses to access those properties in your LINQ Queries.
EDIT:
Something like this:
// IEvent declares the properties common to all event classes.
public class Event : IEvent
{
    // your code
}
public class MeetingEvent : IEvent
{
    // Renamed from "MeetingEvent": a C# member cannot share its enclosing type's name.
    public string Location
    {
        get;
        set;
    }
    // add IEvent implementation...
}
public class EventManager
{
    // "event" is a reserved word in C#, so the parameter is named evt.
    public void MyMethod (IEvent evt)
    {
        // Cast the IEvent parameter to the concrete type to reach its own members.
        // This throws InvalidCastException if evt is not a MeetingEvent.
        MeetingEvent meeting = (MeetingEvent)evt;
        meeting.DoSomework();
    }
}

How many variables are too much for a class?

I want to see if anyone has a better design for a class (class as in OOP) I am writing. We have a script that puts shared folder stats in a CSV file. I am reading that in and putting it in a Share class.
My boss wants to know information like:
Total Number of Files
Total Size of Files
Number of Office Files
Size of Office Files
Number of Exe Files
Size of Exe Files
etc ....
I have a class with variables like $numOfficeFiles, $sizeOfficeFiles, etc. with a ton of get/set methods. Isn't there a better way to do this? What is the general rule if you have a class with a lot of variables/properties?
I think of this as a language agnostic question, but if it matters, I am using PHP.

Whenever I see more than 5 or 6 non-final variables in a class I get antsy.
Chances are that they should probably be placed in a smaller class as suggested by Outlaw Programmer. There's also a good chance it could just be placed in a hashtable.
Here's a good rule of thumb: If you have a variable that has nothing but a setter and a getter, you have DATA, not code--get it out of your class and place it into a collection or something.
Having a variable with a setter and a getter just means that either you never do anything with it (it's data) or the code that manipulates it is in another class (terrible OO design, move the variable to the other class).
Remember--every piece of data that is a class member is something you will have to write specific code to access; for instance, when you transfer it from your object to a control on a GUI.
I often tag GUI controls with a name so I can iterate over a collection and automatically transfer data from the collection to the screen and back, significantly reducing boilerplate code; storing the data as member variables makes this process much more complicated (requires reflection).

Sometimes, data can be just data:
files = {
'total': { count: 200, size: 3492834 },
'office': { count: 25, size: 2344 },
'exe': { count: 30, size: 342344 },
...
}

"A class should do one thing, and do it well"
If you're not breaking this rule, then I'd say there aren't too many.
However it depends.
If by too many you mean 100's, then you might want to break it into a data class and collection as shown in the edit below.
Then you've only one get/set operation; however, there are pros and cons to this "laziness".
EDIT:
On second glance, you've pairs of variables, Count and Size.
There should be another class, e.g. FileInfo, with count and size; now your first class just has FileInfo objects.
You can also put file type e.g. "All", "Exe" . . . on the File Info class.
Now the parent class becomes a collection of FileInfo objects.
Personally, I think I'd go for that.
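A rough Java sketch of that idea follows (the question uses PHP, and the names FileTypeStats and ShareStats are assumptions chosen to avoid clashing with any existing FileInfo type). Each count/size pair lives in one small object, and the parent class is reduced to a collection of those objects keyed by file type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// One count/size pair for a single file type.
class FileTypeStats {
    private int count;
    private long totalSize;

    void addFile(long sizeInBytes) {
        count++;
        totalSize += sizeInBytes;
    }

    int getCount() {
        return count;
    }

    long getTotalSize() {
        return totalSize;
    }
}

// The share is just a collection of per-type statistics.
class ShareStats {
    private final Map<String, FileTypeStats> statsByType = new LinkedHashMap<>();

    void record(String type, long sizeInBytes) {
        statsByType.computeIfAbsent(type, t -> new FileTypeStats()).addFile(sizeInBytes);
    }

    // Unknown types report zero files rather than null.
    FileTypeStats forType(String type) {
        return statsByType.getOrDefault(type, new FileTypeStats());
    }
}
```

Adding a new file type to the report then needs no new fields or accessors, only another key.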

I think the answer is "there's no such thing as too many variables."
But then, if this data is going to be kept for a while, you might just want to put it in a database and make your functions calls to the database.
I assume you don't want to recalculate all these values every time you're asked for them.

Each class' "max variables" count really is a function of what data makes sense for the class in question. If there are truly X different values for a class and all data is related, that should be your structure. It can be a bit tedious to create depending on the language being used, but I wouldn't say there is any "limit" that you shouldn't exceed. It is dictated by the purpose.

Sounds like you might have a ton of duplicate code. You want the # of files and the size of files for a bunch of different types. You can start with a class that looks like this:
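public class FileStats
{
    private int size;
    private int numFiles;
    public FileStats(String extension)
    {
        // logic to discover files goes here (it should set size and numFiles)
    }
    public int getSize() { return size; }
    public int getNumFiles() { return numFiles; }
}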
Then, in your main class, you can have an array of all the file types you want, and a collection of these helper objects:
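import java.util.HashMap;
import java.util.Map;

public class Statistics
{
    private static final String[] TYPES = { "exe", "doc", "png" };
    // Keyed by file type so the stats for one type can be looked up directly.
    private final Map<String, FileStats> stats = new HashMap<String, FileStats>();
    public void collectStats()
    {
        stats.clear();
        for(String type : TYPES)
            stats.put(type, new FileStats(type));
    }
}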
You can clean up your API by passing a parameter to the getter method:
public int getNumFiles(String type)
{
return stats.get(type).getNumFiles();
}

There is no "hard" limit. OO design does however have a notion of coupling and cohesion. As long as your class is loosely coupled and highly cohesive I believe that you are ok with as many members/methods as you need.

Maybe I didn't understand the goal, but why do you load all the values into memory by using the variables, just to dump them to the csv file (when?). I'd prefer a stateless listener to the directory and writing values immediately to the csv.

I always try to think of a Class as being the "name of my container" or the "name of the task" that I am going to compute. Methods in the Class are "actions" part of the task.
In this case seems like you can start grouping things together, for example you are repeating the number and the size actions many times. Why not create a super class that other classes inherit from, for example:
class NameOfSuperClass {
public $type;
function __construct($type) {
$this->type = $type;
$this->getNumber();
$this->getSize();
}
public function getNumber() {
// do something with the type and the number
}
public function getSize() {
// do something with the type and the size
}
}
class OfficeFiles extends NameOfSuperClass {
    function __construct() {
        // In PHP the parent constructor is called explicitly like this.
        parent::__construct("office");
    }
}
I'm not sure if this is right in PHP, but you get my point. Things will start to look a lot cleaner and better to manage.

Just from what I glanced at:
If you keep an array with all of the file names in it, all of those variables can be computed on the fly.
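Computing the figures on the fly could look like the Java sketch below. It assumes each entry carries a size as well as a name, since sizes cannot be derived from names alone; SharedFile and ShareReport are hypothetical names, and which extensions count as "Office" is left to the caller.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class SharedFile {
    final String name;
    final long sizeInBytes;

    SharedFile(String name, long sizeInBytes) {
        this.name = name;
        this.sizeInBytes = sizeInBytes;
    }

    // Lower-cased text after the last dot, or "" when there is no dot.
    String extension() {
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}

// Holds only the raw list; every statistic is derived when asked for.
class ShareReport {
    private final List<SharedFile> files;

    ShareReport(List<SharedFile> files) {
        this.files = files;
    }

    long totalCount() {
        return files.size();
    }

    long totalSize() {
        return files.stream().mapToLong(f -> f.sizeInBytes).sum();
    }

    long countOf(String... extensions) {
        Set<String> wanted = new HashSet<>(Arrays.asList(extensions));
        return files.stream().filter(f -> wanted.contains(f.extension())).count();
    }

    long sizeOf(String... extensions) {
        Set<String> wanted = new HashSet<>(Arrays.asList(extensions));
        return files.stream()
                .filter(f -> wanted.contains(f.extension()))
                .mapToLong(f -> f.sizeInBytes)
                .sum();
    }
}
```

The trade-off is that every query rescans the list, which is fine for a CSV-sized input but worth caching for very large shares.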

It's more of a readability issue.
I would wrap all the data into an array. And use just one pair of get/set methods.
Something like:
class Test
{
    private $DATA = array();
    function set($what, $data) {
        // Write to the member array, not to a local variable.
        $this->DATA[$what] = $data;
    }
    function get($what) {
        return $this->DATA[$what];
    }
}
