How to save and load different types of objects? - language-agnostic

During coding I frequently encounter this situation:
I have several objects (ConcreteType1, ConcreteType2, ...) with the same base type AbstractType, which has abstract methods save and load . Each object can (and has to) save some specific kind of data, by overriding the save method.
I have a list of AbstractType objects which contains various ConcreteTypeX objects.
I walk the list and the save method for each object.
At this point I think it's a good OO design. (Or am I wrong?) The problems start when I want to reload the data:
Each object can load its own data, but I have to know the concrete type in advance, so I can instantiate the right ConcreteTypeX and call the load method. So the loading method has to know a great deal about the concrete types. I usually "solved" this problem by writing some kind of marker before calling save, which is used by the loader to determine the right ConcreteTypeX.
I always had/have a bad feeling about this. It feels like some kind of anti-pattern...
Are there better ways?
EDIT:
I'm sorry for the confusion, I re-wrote some of the text.
I'm aware of serialization and perhaps there is some next-to-perfect solution in Java/.NET/yourFavoriteLanguage, but I'm searching for a general solution, which might be better and more "OOP-ish" compared to my concept.

Is this either .NET or Java? If so, why aren't you using serialisation?

If you can't simply use serialization, then I would still definitely pull the object loading logic out of the base class. Your instinct is correct, leading you to correctly identify a code smell. The base class shouldn't need to change when you change or add derived classes.
The problem is, something has to load the data and instantiate those objects. This sounds like a job for the Abstract Factory pattern.

There are better ways, but let's take a step back and look at it conceptually. What are all objects doing? Loading and Saving. When you get the object from memory, you really don't to have to care whether it gets its information from a file, a database, or the windows registry. You just want the object loaded. That's important to remember because later on, your maintanence programmer will look at the LoadFromFile() method and wonder, "Why is it called that since it really doesn't load anything from a file?"
Secondly, you're running into the issue that we all run into, and it's based in dividing work. You want a level that handles getting data from a physical source; you want a level that manipulates this data, and you want a level that displays this data. This is the crux of N-Tier Development. I've linked to an article that discusses your problem in great detail, and details how to create a Data Access Layer to resolve your issue. There are also numerous code projects here and here.
If it's Java you seek, simply substitute 'java' for .NET and search for 'Java N-Tier development'. However, besides syntactical differences, the design structure is the same.

Related

Get Map value like plain old Javascript objects

I'm new to Immutable.js, so this is a very trivial question.
It looks like I can't get a Map value like with plain old Javascript objects, e.g. myMap.myKey. Apparently I have to write myMap.get('myKey')
I am very surprised by this behavior. Is there a reason for that? Is there any extension to Immutable.js which would allow me to type myMap.myKey?
Came back to elaborate on my comment, but SO doesn't allow that after certain time. Converting it into an answer.
The question you have asked has been reciprocated several times with people who start new with immutable, yours truly included. Its on one of the rants I wrote a while ago.
It starts to make sense when you look at it from immutability perspective. If you expose value types as your own properties, they won't be immutable because they are value types and could be assigned to.
Nonetheless, its frustrating to spread these getters all across your components/views. If you can afford it, you should try to use the Record type. It offers traditional access to members (except in IE 8). Better still, you can extend from this type and add helper getters/setters (e.g. user.getName(), user.setName('thebat') instead of user.get('name')/set('name', 'thebat')) to abstract your model's internal structure from your views. However there are challenges to overcome like nested structures and de-serialization of objects.
If the above is not your cup of tea, I'd recommend swallowing the bitter pill :).
I think you are missing the concept Immutable was build:
Immutable data cannot be changed once created, leading to much simpler
application development, no defensive copying, and enabling advanced
memoization and change detection techniques with simple logic.
Persistent data presents a mutative API which does not update the data
in-place, but instead always yields new updated data.
One way or another you may transform Immutable data structures to plain old JS objects as: myMap.toJS()

Restructuring to avoid accessing components in models

Continuing to work on my port of a CakePHP 1.3 app to 3.0, and have run into another issue. I have a number of areas where functionality varies based on certain settings, and I have previously used a modular component approach. For example, Leagues can have round-robin, ladder or tournament scheduling. This impacts on the scheduling algorithm itself, such that there are different settings required to configure each type, but also dictates the way standings are rendered, ties are broken, etc. (This is just one of 10 areas where I have something similar, though not all of these suffer from the problem below.)
My solution to this in the past was to create a LeagueComponent with a base implementation, and then extend that class as LeagueRoundRobinComponent, LeagueLadderComponent and LeagueTournamentComponent. When controllers need to do anything algorithm-specific, they check the schedule_type field in the leagues table, create the appropriate component, and call functions in it. This still works just fine.
I mentioned that this also affects views. The old solution for this was to pass the league component object from the controller to the view via $this->set. The view can then query it for various functionality. This is admittedly a bit kludgy, but the obvious alternative seems to be extracting all the info the view might require and setting it all individually, which doesn't seem to me to be a lot better. If there's a better option, I'm open to it, but I'm not overly concerned about this at the moment.
The problem I've encountered is when tables need to get some of that component info. The issue at hand is when I am saving my add/edit form and need to deal with the custom settings. In order to be as flexible as possible for the future, I don't have all of these possible setting fields represented in the database, but rather serialize them into a single "custom" column. (Reading this all works quite nicely with a custom constructor and getters.) I had previously done this by loading the component from the beforeSave function in the League model, calling the function that returns the list of schedule-specific settings, extracting those values and serializing them. But with the changes to component access in 3.0, it seems I can no longer create the component in my new beforeMarshal function.
I suppose the controller could "pass" the component to the table by setting it as a property, but that feels like a major kludge, and there must be a better way. It doesn't seem like extending the table class is a good solution, because that would horribly complicate associations. I don't think that custom types are the solution, as I don't see how they'd access a component either. I'm leaning towards passing just the list of fields from the controller to the model, that's more of a "configuration" method. Speaking of configuration, I suppose it could all just go into the central Configure data store, but that's always felt to me like somewhere that you only put "small" data. I'm wondering if there's a better design pattern I could follow that would let the table continue to take care of these implementation details on its own without the controller needing to get involved; if at some point I decide to change from the serialized method to adding all of the possible columns, it would be nice to have those changes restricted to the table class.
Oh, and keep in mind that this list of custom settings is needed in both a view and the table, so whatever solution is proposed will ideally provide a way for both of them to access it, rather than requiring duplication of code.

JSON Circular Structure in Entity Framework and Breeze

I have a horribly nested Entity Framework structure. A Schedule holds multiple defaults and multiple overrides. Each default/override has a reference back the schedule and a "Type". The Type has a reference back to any defaults or overrides it belongs to. It's messy, but I think it's probably the only way I can do what's required.
This data ends up in a browser in the form of Breeze entities. Before I can process these when saving them back at the server, I have to turn them back into JSON, which unsurprisingly trips the dreaded "Uncaught TypeError: Converting circular structure to JSON".
Now there are a number of perfectly good scripts for removing these circular structures. But all of them seem to replace the circular references with some sort of placeholder so they can be re-created as objects. But of course Entity Framework doesn't recognise these, so can't work with them.
I'm at a loss as to what to do at this point. Simply removing the circular references rather than replacing them doesn't seem to help as it can result in structures shorn of important data.
I've also tried changing my EF queries to only get back specifically the data required, but it insists on giving me absolutely everything, even though Lazy Loading is set to false and I have no .Include statements in my queries. But I feel this is solving the wrong problem, since we're bound to want to deal with complex data at some point.
Is there any other way round this?
EDIT: I've solved this temporarily by investigating the object and removing the circular properties by name. But I'd still like a generic solution if at all possible.
Seems like you are after serialization mode. find out serialization mode in properties in your designer screen and set it to unidirectional. this will solve your serialization issue.
Hope that helps!!!
Not sure I understand the question. You should never experience any kind of circularity issue, regardless of the complexity of your model, with a Breeze SaveChanges call. (Breeze internally unwraps all entities and any circularities before serializing). If you are seeing something different then this would be a bug. Is this the case?

best practices for writing to a file from multiple methods

I have a class that contains a bunch of methods for checking data I scrape every week (for things like well-formedness and other errors in gathering the data). Each of these methods performs a test, and then prints out a summary of the test.
I want to print out the output from these tests to a file, but I'm not sure what the best way to do it is. For example...
Should the class hold an instance variable to the file, and each method open/appends/closes the file? (A problem is that methods sometimes call other methods, so this seems kinda messy?)
Should each method get passed the file as a parameter? (Seems messy as well.)
Should each method return a string, and a"central" method that calls all the other tests outputs all these strings to a file?
I'm not really familiar with using logger libraries -- would that be a solution?
My particular context
I have a scraper that pulls data from various websites and stores them in a database. Websites change all the time, so I'm writing a "scrape checker" program that checks my scrapes for various things, like:
number of empty results
length of results
weird characters in results
and so on
So I have methods like:
check_num_empty_results
check_weird_characters
check_scrape (calls a bunch of other checks)
check_scrape_pair (sometimes I want to check pairs of scrapes together, e.g., to match results against each other, so this is different checking each one in isolation)
etc.
I want my "scrape checker" program to print out a file that summarizes all the checks.
Separation of concerns. Write code the focuses on the scraping activity and return the value(s) scraped. Then use aspect oriented programming for logging, which can simplify the problem greatly as the aspect holds the reference to the file or logging API.
Ultimately, it depends on what language you're using.
The first solution makes the most sense if your language permits it. For each instance of the logging class, have a field for the file object that you're reading from/writing to. This is basically equivalent to passing the file object as a parameter to every method.
That said, most mature languages have modules that will do a lot of this work for you; off the top of my sh/awk, Perl, and Python all come to mind as being suited to this task (though if you want to, you could use Java or something else).
Seems like a logging framework would be a perfect solution for this. If you are using Java or .NET, log4j and log4net are pretty much the de-facto standards for that.

DoSomethingToThing(Thing n) vs Thing.DoSomething()

What factors determine which approach is more appropriate?
I think both have their places.
You shouldn't simply use DoSomethingToThing(Thing n) just because you think "Functional programming is good". Likewise you shouldn't simply use Thing.DoSomething() because "Object Oriented programming is good".
I think it comes down to what you are trying to convey. Stop thinking about your code as a series of instructions, and start thinking about it like a paragraph or sentence of a story. Think about which parts are the most important from the point of view of the task at hand.
For example, if the part of the 'sentence' you would like to stress is the object, you should use the OO style.
Example:
fileHandle.close();
Most of the time when you're passing around file handles, the main thing you are thinking about is keeping track of the file it represents.
CounterExample:
string x = "Hello World";
submitHttpRequest( x );
In this case submitting the HTTP request is far more important than the string which is the body, so submitHttpRequst(x) is preferable to x.submitViaHttp()
Needless to say, these are not mutually exclusive. You'll probably actually have
networkConnection.submitHttpRequest(x)
in which you mix them both. The important thing is that you think about what parts are emphasized, and what you will be conveying to the future reader of the code.
To be object-oriented, tell, don't ask : http://www.pragmaticprogrammer.com/articles/tell-dont-ask.
So, Thing.DoSomething() rather than DoSomethingToThing(Thing n).
If you're dealing with internal state of a thing, Thing.DoSomething() makes more sense, because even if you change the internal representation of Thing, or how it works, the code talking to it doesn't have to change. If you're dealing with a collection of Things, or writing some utility methods, procedural-style DoSomethingToThing() might make more sense or be more straight-forward; but still, can usually be represented as a method on the object representing that collection: for instance
GetTotalPriceofThings();
vs
Cart.getTotal();
It really depends on how object oriented your code is.
Thing.DoSomething is appropriate if Thing is the subject of your sentence.
DoSomethingToThing(Thing n) is appropriate if Thing is the object of your sentence.
ThingA.DoSomethingToThingB(ThingB m) is an unavoidable combination, since in all the languages I can think of, functions belong to one class and are not mutually owned. But this makes sense because you can have a subject and an object.
Active voice is more straightforward than passive voice, so make sure your sentence has a subject that isn't just "the computer". This means, use form 1 and form 3 frequently, and use form 2 rarely.
For clarity:
// Form 1: "File handle, close."
fileHandle.close();
// Form 2: "(Computer,) close the file handle."
close(fileHandle);
// Form 3: "File handle, write the contents of another file handle."
fileHandle.writeContentsOf(anotherFileHandle);
I agree with Orion, but I'm going to rephrase the decision process.
You have a noun and a verb / an object and an action.
If many objects of this type will use this action, try to make the action part of the object.
Otherwise, try to group the action separately, but with related actions.
I like the File / string examples. There are many string operations, such as "SendAsHTTPReply", which won't happen for your average string, but do happen often in a certain setting. However, you basically will always close a File (hopefully), so it makes perfect sense to put the Close action in the class interface.
Another way to think of this is as buying part of an entertainment system. It makes sense to bundle a TV remote with a TV, because you always use them together. But it would be strange to bundle a power cable for a specific VCR with a TV, since many customers will never use this. The key idea is how often will this action be used on this object?
Not nearly enough information here. It depends if your language even supports the construct "Thing.something" or equivalent (ie. it's an OO language). If so, it's far more appropriate because that's the OO paradigm (members should be associated with the object they act on). In a procedural style, of course, DoSomethingtoThing() is your only choice... or ThingDoSomething()
DoSomethingToThing(Thing n) would be more of a functional approach whereas Thing.DoSomething() would be more of an object oriented approach.
That is the Object Oriented versus Procedural Programming choice :)
I think the well documented OO advantages apply to the Thing.DoSomething()
This has been asked Design question: does the Phone dial the PhoneNumber, or does the PhoneNumber dial itself on the Phone?
Here are a couple of factors to consider:
Can you modify or extend the Thing class. If not, use the former
Can Thing be instantiated. If not, use the later as a static method
If Thing actually get modified (i.e. has properties that change), prefer the latter. If Thing is not modified the latter is just as acceptable.
Otherwise, as objects are meant to map on to real world object, choose the method that seems more grounded in reality.
Even if you aren't working in an OO language, where you would have Thing.DoSomething(), for the overall readability of your code, having a set of functions like:
ThingDoSomething()
ThingDoAnotherTask()
ThingWeDoSomethingElse()
then
AnotherThingDoSomething()
and so on is far better.
All the code that works on "Thing" is on the one location. Of course, the "DoSomething" and other tasks should be named consistently - so you have a ThingOneRead(), a ThingTwoRead()... by now you should get point. When you go back to work on the code in twelve months time, you will appreciate taking the time to make things logical.
In general, if "something" is an action that "thing" naturally knows how to do, then you should use thing.doSomething(). That's good OO encapsulation, because otherwise DoSomethingToThing(thing) would have to access potential internal information of "thing".
For example invoice.getTotal()
If "something" is not naturally part of "thing's" domain model, then one option is to use a helper method.
For example: Logger.log(invoice)
If DoingSomething to an object is likely to produce a different result in another scenario, then i'd suggest you oneThing.DoSomethingToThing(anotherThing).
For example you may have two was of saving thing in you program so you might adopt a DatabaseObject.Save(thing) SessionObject.Save(thing) would be more advantageous than thing.Save() or thing.SaveToDatabase or thing.SaveToSession().
I rarely pass no parameters to a class, unless I'm retrieving public properties.
To add to Aeon's answer, it depends on the the thing and what you want to do to it. So if you are writing Thing, and DoSomething alters the internal state of Thing, then the best approach is Thing.DoSomething. However, if the action does more than change the internal state, then DoSomething(Thing) makes more sense. For example:
Collection.Add(Thing)
is better than
Thing.AddSelfToCollection(Collection)
And if you didn't write Thing, and cannot create a derived class, then you have no chocie but to do DoSomething(Thing)
Even in object oriented programming it might be useful to use a function call instead of a method (or for that matter calling a method of an object other than the one we call it on). Imagine a simple database persistence framework where you'd like to just call save() on an object. Instead of including an SQL statement in every class you'd like to have saved, thus complicating code, spreading SQL all across the code and making changing the storage engine a PITA, you could create an Interface defining save(Class1), save(Class2) etc. and its implementation. Then you'd actually be calling databaseSaver.save(class1) and have everything in one place.
I have to agree with Kevin Conner
Also keep in mind the caller of either of the 2 forms. The caller is probably a method of some other object that definitely does something to your Thing :)