Does a "thin data access layer" mainly imply writing SQL by hand?

Does a "thin data access layer" mainly imply writing SQL by hand? - terminology

When you say "thin data access layer", does this mainly mean you are talking about writing your SQL manually as opposed to relying on an ORM tool to generate it for you?

That probably depends on who says it, but for me a thin data access layer would imply that there is little to no additional logic in the layer (i.e. data storage abstractions), probably no support for targeting multiple RDBMS, no layer-specific caching, no advanced error handling (retry, failover), etc.
Since ORM tools tend to supply many of those things, a solution with an ORM would probably not be considered "thin". Many home-grown data access layers would also not be considered "thin" if they provide features such as the ones listed above.

Depends on how we define the word "thin". It's one of the most abused terms I hear, rivaled only by "lightweight".
That's one way to define it, but perhaps not the best. An ORM layer does a lot besides just generate SQL for you (e.g., caching, marking "dirty" fields, etc.) That "thin" layer written in lovingly crafted SQL can become pretty bloated by the time you implement all the features an ORM is providing.

I think "thin" in this context means:
It is lightweight;
It has a low performance overhead; and
You write minimal code.
Writing SQL certainly fits this bill but there's no reason it couldn't be an ORM either although most ORMs that spring to mind don't strike me as lightweight.

I think it depends on the context.
It could very well mean that, or it may simply mean that your business objects map directly a simple underlying relational table structure: one table per class, one column per class attribute, so that the translation of business object structure to database table structure is "thin" (i.e. not complex). This could still be handled by an ORM of course.

It may mean that there is no or minimal logic employed on the database such as avoiding the use of stored procedures. As other people have mentioned it depends on the statement's context as to the most likely meaning.

I thought data access were always supposed to be thin... DALs aren't really the place to have logic.
Maybe the person you talked to is talking about a combination of a business layer and a data access layer; where the business layer is non-existent (e.g. a very simple app, or perhaps all of the business rules are in the database, etc).

Related

Is there any reason to use one DataContext instance, instead of several?

For example, I have 2 methods that use one DataContext (Linq to sql).
using(DataContext data = new DataContext){
// doing something
another_datamethod(data);
}
void another_datamethod(DataContext data){
// doing
}
Use this style? Or with the same result, I can create separate "using DataContext". What benefits, I would achieve if i'll use one DataContext? Maybe some cache possibilities?

Recently, I've read numerous articles and blogs that "highly recommend" that you use multiple DataContexts for your applications, due to multiple issues including the creation of records associated with lookup tables. When I was learning LINQ-to-SQL, one of the most attractive qualities of it for me was the ability to import my complete database schema into one "big" DataContext. So, that's what I did...but a few months, in comes the contradictory information saying that what I did was a bad thing. What to do, what to do...
Nine months later, here's where I stand. My single large DataContext is still my single large DataContext. I have over thirty data repository classes accessing the sixty-plus tables contained within, and I still haven't seen a valid reason to break up my existing data-dom, or to not handle the next project using a single DataContext. The problems that the article and blog writers experienced were valid problems. However, like most things technical, there's never just one way to do things. The best investment of my time and energy was to learn and truly understand how LINQ-to-SQL does what it does. The best book that I found to help me do exactly that is Pro LINQ: Language Integrated Query in C# 2008 by Joseph C. Rattz, Jr. The LINQ-to-SQL coverage is detailed and clear, and there are plenty of examples to clarify the mystery.
So, in your case, create one big DataContext or create many smaller ones...the choice is up to you. Smaller ones clearly give better opportunity for reuse, while one big one helps increase the time you can focus on business logic and presentation code.

Datacontexts track changes and do caching, so yes caching is a possibility depending on what work you are performing.

At what point should architecture become layered?

Obviously, "Hello World" doesn't require a separated, modular front-end and back-end. But any sort of Enterprise-grade project does.
Assuming some sort of spectrum between these points, at which stage should an application be (conceptually, or at a design level) multi-layered? When a database, or some external resource is introduced? When you find that the you're anticipating spaghetti code in your methods/functions?

when a database, or some external resource is introduced.
but also:
always (except for the most trivial of apps) separate AT LEAST presentation tier and application tier
see:
http://en.wikipedia.org/wiki/Multitier_architecture

Layers are a mean to keep a design loosely coupled and highly cohesive.
When you start to have a few classes (either implemented or just sketched with UML), they can be grouped logically, into layers - or more generally packages, or modules. This is called the art of separating the concerns.
The sooner the better: if you do not start layering early enough, then you risk to have never do it as the effort can be too important.

Here are some criteria of when to...
Any time you anticipate the need to
replace one part of it with a
different part.
Any time you find
yourself need to divide work amongst
parallel team.

There is no real answer to this question. It depends largely on your application's needs, and numerous other factors. I'd suggest reading some books on design patterns and enterprise application architecture. These two are invaluable:
Design Patterns: Elements of Reusable Object-Oriented Software
Patterns of Enterprise Application Architecture
Some other books that I highly recommend are:
The Pragmatic Programmer: From Journeyman to Master
Refactoring: Improving the Design of Existing Code
No matter your skill level, reading these will really open your eyes to a world of possibilities.

I'd say in most cases dealing with multiple distinct levels of abstraction in the concepts your code deals with would be a strong signal to mirror this with levels of abstraction in your implementation.
This does not override the scenarios that others have highlighted already though.

I think once you ask yourself "hmm should I layer this" the answer is yes.
I've worked on too many projects that probably started off as proof of concept/prototype that ended up being full projects used in production, which are horribly written and just wreak of "get it done quick, we'll fix it later." Trust me, you wont fix it later.
The Pragmatic Programmer lists this as the Broken Window Theory.
Try and always do it right from the start. Separate your concerns. Build it with modularity in mind.
And of course try and think of the poor maintenance programmer who might take over when you're done!

Thinking of it in terms of layers is a little limiting. It's what you see in whitepapers about a product, but it's not how products really work. They have "boxes" that depend on each other in various ways, and you can make it look like they fit into layers but you can do this in several different configurations, depending on what information you're leaving out of the diagram.
And in a really well-designed application, the boxes get very small. They are down to the level of individual interfaces and classes.
This is important because whenever you change a line of code, you need to have some understanding of the impact your change will have, which means you have to understand exactly what the code currently does, what its responsibilities are, which means it has to be a small chunk that has a single responsibility, implementing an interface that doesn't cause clients to be dependent on things they don't need (the S and the I of SOLID).
You may find that your application can look like it has two or three simple layers, if you narrow your eyes, but it may not. That isn't really a problem. Of course, a disastrously badly designed application can look like it has layers tiers if you squint as hard as you can. So those "high level" diagrams of an "architecture" can hide a multitude of sins.

My generic rule of thumb is to at least to separate the problem into a model and view layer, and throw in a controller if there is a possibility of more than one ways of handling the model or piping data to the view.
(Or as the first answer, at least the presentation tier and the application tier).

Loose coupling is all about minimising dependencies, so I would say 'layer' when a dependency is introduced. i.e. a database, third party application, etc.
Although 'layer' is probably the wrong term these days. Most of the time I use Dependency Injection (DI) through an Inversion of Control container such as Castle Windsor. This means that I can code on one part of my system without worrying about the rest. It has the side effect of ensuring loose coupling.
I would recommend DI as a general programming principle all of the time so that you have the choice on how to 'layer' your application later.
Give it a look.
R

How often do you need to create a real class hierarchy in your day to day programming?

I create business applications with heavy database use. Most of the programming work is just to connect components to the database and modifying components to adapt to general interface behaviour. I mostly use Delphi with its rich VCL library, and generally buy components needed. I keep most of the business logic in the database. I rarely get the chance to build a nice class hierarchy from the bottom up as there really is no need. Anyone else have this experience?

For me, occasionally a problem is clearer or easier with subclassing, but not often.
This also changes quite a bit in a given design as it's refactored.
My biggest problem is that programming courses and texts give so much weight to inheritance, hierarchies, and polymorphism through base classes (vs. interfaces or dynamic typing). This helps create legions of programmers that subclass everything and their mother.

The answer to this question is not totally language-agnostic;
Some languages like Java have a fairly limited set of language features available, meaning that subclassing is fairly often used because it's a convenient method for re-use, technical inheritance.
Closures and lambdas of C# make inheritance for technical reasons much less relevant. So normally inheritance is used for semantic reasons (like cat extends animal).
The last C# project I worked on, we more or less made all of the class hierarchies within a few weeks. After that it was more or less over.
On my current java project we create new class hierarchies all of the time.
Other languages will have other features that similarly affect this composition (mixins come to mind)

I put on my architecting/class design hat probably once or twice a month. It's probably the best hat I have and is the most fun to wear.
Depends what stage of the lifecycle your project is in though.

When your tackling problem domains you are well familiar with and already have a common code base to work from, you often have no need to create a new class hierarchy. It's when you stumble upon problems you have no ready solutions for, that you start building your own.
It's also very dependant on the type of applications you develop. If your domain already has well accepted conventions and libraries to work from, there probably isn't any need to reinvent the wheel (other than personal / academic interests). Some areas have inherently less available resources to work with, and in those you'll find yourself building everything from scratch most of the time.

A majority of applications, especially business applications, contains at least some kind of business logic in it. I would contend that business should not be in the database, but should rather be in the application. You can put referential integrity in the database as I think this is a good choice, but business logic should be only in the application.
By class hierarchy, I suppose you mean do you always have to end up with some inheritance in your object model, then the answer is no. But chances are you can often find some common code, factor it out and create a base class to contain the common code.
If you agree with me on the point that business logic should not be in the database, but should be in the application, then I recommend you look into the MVC Design Pattern to guide your design. You will find your design contain classes or objects. Your VCLs will represent your View, and you can have your Model classes map directly to the database table, i.e. each member in the class in the model corresponds to a field in a database table (again, this is the norm but there will be exception, where this simplicity fails to apply). Then you'll need a layer to handle the CRUD (Create, Read, Update, Delete) of the Model classes to the database tables. You will end up with an "layered" application that is easier to maintain and enhance.

It depends on what you mean by hierarchy - inheritance or layering?
When object oriented languages first came out, inheritance was overused. Complicated hierarchies were common. Now, interfaces (as in Java and C#) provide a simpler way to get the benefit of polymorphism without the complications of inheritance. I rarely use inheritance anymore.
Layering, however, is vital when creating a large application. Layering prevents general low-level classes (like lists) from directly referencing specific high-level classes (like web browser windows). As far as I know, there isn't a formal way to describe layering, but there are general guidelines (model-view-controller (MVC), separate GUI logic from business logic, separate data from presentation, etc.).

It really depends on the types/phases of the projects you're working on. I happen to do that everyday because I'm working on database internals for a new database, creating related libraries/frameworks. I'd imagine doing that a lot less if I'm working within a mature framework using other people's libraries.

I'm doing Infrastructure for our companys' product, so I'm writing a lot of code that will be used later by guys in other teams. So I end up writing lots of abstract classes, interfaces, hierarchies and so on. Mostly it's just a pattern of "default behaviour in an abstract/virtual class, which other programmers may override".
Very challenging, I must say.

The time that I find class hierarchies most beneficial is when the relationship between objects actually does match a true "is-a" relationship in the domain.
However if I can avoid large hierarchies I will due to the fact that they are often a little more tricky to map to relational databases and can really complicate your database designs. Since you say most of your applications make heavy use of databases this would be something to take into consideration.

Object Normalization

In the same line as Database Normalization - is there an approach to object normalization, not design pattern, but the same mathematical like approach to normalizing object creation. For example: first normal form: no repeating fields....
here's some links to DB Normalization:
http://en.wikipedia.org/wiki/Database_normalization
http://databases.about.com/od/specificproducts/a/normalization.htm
Would this make object creation and self-documentation better?
Here's a link to a book about class normalization (guess we're really talking about classes)
http://www.agiledata.org/essays/classNormalization.html

Normalization has a mathematical foundation in predicate logic, and a clear and specific goal that the same piece of information never be represented twice in a single model; the purpose of this goal is to eliminate the possibility of inconsistent information in a data model. It can be shown via mathematical proof that if a data model has certain specific properties (that it passes tests for 1st Normal Form (1NF), 2NF, 3NF, etc.) that it is free from redundant data representation, i.e. it is Normalized.
Object orientation has no such underlying mathematical basis, and indeed, no clear and specific goal. It is simply a design idea for introducing more abstraction. The DRY principle, Command-Query Separation, Liskov Substitution Principle, Open-Closed Principle, Tell-Don't-Ask, Dependency Inversion Principle, and other heuristics for improving quality of code (many of which apply to code in general, not just object oriented programs) are not absolute in nature; they are guidelines that programmers have found useful in improving understandability, maintainability, and testability of their code.
With a relational data model, you can say with absolute certainty whether it is "normalized" or not, because it must pass ALL the tests for normal form, and they are quite specific. With an object model, on the other hand, because the goal of "understandable, maintainable, testable, etc" is rather vague, you cannot say with any certainty whether you have met that goal. With many of the design heuristics, you cannot even say for sure whether you have followed them. Have you followed the DRY principle if you're applying patterns to your design? Surely repeated use of a pattern isn't DRY? Furthermore, some of these heuristics or principles aren't always even necessarily good advice all the time. I do try to follow Command-Query Separation, but such useful things as a Stack or a Queue violate that concept in order to give us a rather elegant and useful result.

I guess the Single Responsible Principle is at least related to this. Or at least, violation of the SRP is similar to a lack of normalization in some ways.
(It's possible I'm talking rubbish. I'm pretty tired.)

Interesting.
You may also be interested in looking at the Law of Demeter.
Another thing you may be interested in is c2's FearOfAddingClasses, as, arguably, the same reasoning that lead programmers to denormalise databases also leads to god classes and other code smells. For both OO and DB normalisation, we want to decompose everything. For databases this means more tables, for OO, more classes.
Now, it is worth bearing in mind the object relational impedance mismatch, that is, probably not everything will translate cleanly.
Object relational models or 'persistence layers', usually have 1-to-1 mappings between object attributes and database fields. So, can we normalise? Say we have department object with employee1, employee2 ... etc. attributes. Obviously that should be replaced with a list of employees. So we can say 1NF works.
With that in mind, let's go straight for the kill and look at 6NF database design, a good example is Anchor Modeling, (ignore the naming convention). Anchor Modeling/6NF provides highly decomposed and flexible database schemas; how does this translate to OO 'normalisation'?
Anchor Modeling has these kinds of relationships:
Anchors - unique object IDs.
Attributes, which translate to object attributes: (Anchor, value, metadata).
Ties - relationships between two or more objects (themselves anchors): (Anchor, Anchor... , metadata)
Knots, attributed Ties.
Attribute metadata can be anything - who changed an attribute, when, why, etc.
The OO translation of this is looks extremely flexible:
Anchors suggest attribute-less placeholders, like a proxy which knows how to deal with the attribute composition.
Attributes suggest classes representing attributes and what they belong to. This suggests applying reuse to how attributes are looked up and dealt with, e.g automatic constraint checking, etc. From this we have a basis to generically implement the GOF-style Structural patterns.
Ties and Knots suggest classes representing relationships between objects. A basis for generic implementation of the Behavioural design patterns?
Interesting and desirable properties of Anchor Modeling that also translate across are:
All this requires replacing inheritance with composition (good) in the exposed objects.
Attribute have owners rather than owners having attributes. Although this make attribute lookup more complex, it neatly solves certain aliasing problems, as there can only ever be one owner.
No need for NULL. This translates to clearer NULL handling. Empty-case attribute classes could provide methods for handling the lack of a particular attribute, instead of performing NULL-checking everywhere.
Attribute metadata. Attribute-level full historisation and blaming: 'play' objects back in time, see what changed, when and why, etc. (if required - metadata is entirely optional)
There would probably be a lot of very simple classes (which is good), and a very declarative programming style (also good).
Thanks for such a thought provoking question, I hope this is useful for you.

Perhaps you're taking this from a relational point-of-view, but I would posit that the principles of interfaces and inheritance correspond to normalization in the world of OOP.
For example, a Person abstract class containing FirstName, LastName, Gender and BirthDate can be used by classes such as Employee, User, Member etc. as a valid base class, without a need to repeat the definitions of those attributes in such subclasses.
The principle of DRY, (a core principle of Andy Hunt and Dave Thomas's book The Pragmatic Programmer), and the constant emphasis of object-oriented programming on re-use, also correspond to the efficiencies offered by Normalization in relational databases.

At first glance, I'd say that the objectives of Code Refactoring are similar in an abstract way to the objectives of normalization. But that's pretty abstract.
Update: I almost wrote earlier that "we need to get Jon Skeet in on this one." I posted my answer and who beat me? You guessed it...

Object Role Modeling (not to be confused with Object Relational Mapping) is the closest thing I know of to normalization for objects. It doesn't have as mathematical a foundation as normalization, but it's a start.

In a fairly ad-hoc and untutored fashion, that will probably cause purists to scoff, and perhaps rightly, I think of a database table as being a set of objects of a particular type, and vice versa. Then I take my thoughts from there. Viewed this way, it doesn't seem to me like there's anything particularly special you have to do to use normal form in your everyday programming. Each object's identity will do for starters as its primary key, and references (pointers, etc.) will do by way of foreign keys. Then just follow the same rules.
(My objects usually end up in 3NF, or some approximation thereof. I treat this all more as guidelines, and, like I said, "untutored".)
If the rules are followed properly, each bit of information then ends up in one place, the interrelationships are clear, and everything is structured such that introducing inconsistencies takes some work. One could say that this approach produces good results on this basis, and I would agree.
One downside is that the end result can feel a bit like a tangle of spaghetti, particularly after some time away, and it's hard to shake the constant lingering sensation, even though it's usually false, that surely a few of all these links could be removed...

Object oriented design is rational but it does not have the same mathematically well-defined basis as the Relational Model. There is nothing exactly equivalent to the well-defined normal forms of database design.
Whether this is a strength or a weakness of Object oriented design is a matter of interpretation.

I second the SRP. The Open Closed Principle applies as well to "normalization" although I might stretch the meaning of the word, in that it should be possible to extend the system by adding new implementations, without modifying the existing code. objectmentor about OCP

good question, sorry i can't answer in depth
I've been working on object normalization off and on for over 20 years. It's deep and complicated and beautiful, and is the subject of my second planned book, Object Mechanics II. ONF = Object Normal Form, you heard it here first! ;-)
since potentially patentable technology lurks within, I am not at liberty to say more, except that normalizing the data is the really easy part ;-)
ADDENDUM: change of plans - see https://softwareengineering.stackexchange.com/questions/84598/object-oriented-normalization

Can a Sequence Diagram realistically capture your logic in the same depth as code?

I use UML Sequence Diagrams all the time, and am familiar with the UML2 notation.
But I only ever use them to capture the essence of what I intend to do. In other words the diagram always exists at a level of abstraction above the actual code. Every time I use them to try and describe exactly what I intend to do I end up using so much horizontal space and so many alt/loop frames that its not worth the effort.
So it may be possible in theory but has anyone every really used the diagram in this level of detail? If so can you provide an example please?

I have the same problem but when I realize that I am going low-level I re-read this:
You should use sequence diagrams
when you want to look at the behavior
of several objects within a single use
case. Sequence diagrams are good at
showing collaborations among the
objects; they are not so good at
precise definition of the behavior.
If you want to look at the behavior of
a single object across many use cases,
use a state diagram. If you want
to look at behavior across many use
cases or many threads, consider an
activity diagram.
If you want to explore multiple
alternative interactions quickly, you
may be better off with CRC cards,
as that avoids a lot of drawing and
erasing. It’s often handy to have a
CRC card session to explore design
alternatives and then use sequence
diagrams to capture any interactions
that you want to refer to later.
[excerpt from Martin Fowler's UML Distilled book]

It's all relative. The law of diminishing returns always applies when making a diagram. I think it's good to show the interaction between objects (objectA initializes objectB and calls method foo on it). But it's not practical to show the internals of a function. In that regard, a sequence diagram is not practical to capture the logic at the same depth as code. I would argue for intricate logic, you'd want to use a flowchart.

I think there are two issues to consider.
Be concrete
Sequence diagrams are at their best when they are used to convey to a single concrete scenario (of a use case for example).
When you use them to depict more than one scenario, usually to show what happens in every possible path through a use case, they get complicated very quickly.
Since source code is just like a use case in this regard (i.e. a general description instead of a specific one), sequence diagrams aren't a good fit. Imagine expanding x levels of the call graph of some method and showing all that information on a single diagram, including all if & loop conditions..
That's why 'capturing the essence' as you put it, is so important.
Ideally a sequence diagram fits on a single A4/Letter page, anything larger makes the diagram unwieldy. Perhaps as a rule of thumb, limit the number of objects to 6-10 and the number of calls to 10-25.
Focus on communication
Sequence diagrams are meant to highlight communication, not internal processing.
They're very expressive when it comes to specifying the communication that happens (involved parties, asynchronous, synchronous, immediate, delayed, signal, call, etc.) but not when it comes to internal processing (only actions really)
Also, although you can use variables it's far from perfect. The objects at the top are, well, objects. You could consider them as variables (i.e. use their names as variables) but it just isn't very convenient.
For example, try depicting the traversal of a linked list where you need to keep tabs on an element and its predecessor with a sequence diagram. You could use two 'variable' objects called 'current' and 'previous' and add the necessary actions to make current=current.next and previous=current but the result is just awkward.

Personally I have used sequence diagrams only as a description of general interaction between different objects, i.e. as a quick "temporal interaction sketch". When I tried to get more in depth, all quickly started to be confused...
I've found that the best compromise is a "simplified" sequence diagram followed by a clear but in depth description of the logic underneath.

The answer is no - it does capture it better then your source code!
At least in some aspects. Let me elaborate.
You - like the majority of the programmers, including me - think in source code lines. But the software end product - let's call it the System - is much more than that. It only exists in the mind of your team members. In better cases it also exists on paper or in other documented forms.
There are plenty of standard 'views' to describe the System. Like UML Class diagrams, UML activity diagrams etc. Each diagram shows the System from another point of view. You have static views, dynamic views, but in an architectural/software document you don't have to stop there. You can present nonstandard views in your own words, e.g. deployment view, performance view, usability view, company-values view, boss's favourite things view etc.
Each view captures and documents certain properties of the System.
It's very important to realize that the source code is just one view. The most important though because it's needed to generate a computer program. But it doesn't contain every piece of information of your System, nor explicitly nor implicitly. (E.g. the shared data between program modules, what are only connected via offline user activity. No trace in the source). It's just a static view which helps very little to understand your processes, the runtime dynamics of your living-breathing program.
A classic example of the Observer pattern. Especially if it used heavily, you'll hardly understand the System mechanis from the source code. That's why you use Sequence diagrams in that case. It captures the 'dynamic logic' of your system a lot better than your source code.
But if you meant some kind of business logic in great detail, you are better off with plain text/source code/pseudocode etc. You don't have to use UML diagrams just because they are the standard. You can use usecase modeling without drawing usecase diagrams. Always choose the view what's the best for you and for your purpose.

U.M.L. diagrams are guidelines, not strictly rules.
You don't have to make them exactly & detailed as the source code, but, you may try it, if you want it.
Sometimes, its possible to do it, sometimes, its not possible, because of the detail or complexity of systems, or don't have the time or details to do it.
Cheers.
P.D.
Any cheese-burguer or tuna-fish-burguer for the cat ?

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008