LinqToSql Best Practices - linq-to-sql

I just started creating my data-access layer using LinqToSql. Everybody is talking about the cool syntax and I really like Linq in general.
But when I saw how your classes get generated if you drag some tables on the LinqContext I was amazed: So much code which nobody would need?!
So I looked how other people used LinqToSql, for example Rob Connery at his StoreFront Demo.
As I don't like the way that all this code is generated I created my domain layer by hand and used the generated classes as reference. With that solution I'm fine, as I can use the features provided by Linq (dereferred execution, lazy loading, ...) and my domain layer is quite easy to understand.
How are you using LinqToSql?

The created classes are not as heavy as it seems. Of course it takes quite a few lines of code, but all in all it is as lightweight as it can be for the features it's providing.
I used to create my own tables, too, but now instead I just use the LINQtoSQL DataContext. Why? Creating is simpler, the features are better, interoperability works, it is probably even faster than my own stuff (not in every aspect. Usually my own stuff was extremely fast in one thing, but the generic stuff was faster in everything else).
But the most important part: it is easier to bring new developers into the LINQ stuff than into my own. There are tutorials, example codes, documentation, everything, which I'd have to create for my code by myself. Same with using my stuff with other technology, like WCF or data binding. There are a lot of pitfalls to be taken care of.
I learned not to develop myself into a corner the hard way, it looks fast and easy at the start, is a lot more fun than learning how to use the libs, but is a real pain in the a after a few months down the road, usually even for myself.
After some time the novelty of creating my own data containers wears off, and I noticed the pain connected to adding a feature. A feature I would have had for free, if I had used the provided classes.
Next thing I had to explain my code to some other programmer. Had I used the provided classes, I could have pointed him to some web site to learn about the stuff. But for my classes, I had to train him himself, which took a long time and made it hard to get new people on a project.

LinqToSql generates a set of partial classes for your tables. You can add interface definitions to the 'other half' of these partial classes that implement your domain model.
Then, if you use the repository pattern to wrap access to the Linq queries, so that they return interface implementations of your objects (the underlying Linq objects), LinqToSql becomes quite flexible.

You can hand-write your own classes and either use the LINQ to SQL attributes to declare the mappings or an external XML file.
If you would like to stick with the existing designer and just modify the code generation process pick up my templates that let you tailor the generated code.

Use compiled queries. Linq to SQL is dog slow otherwise. Really.

We use our hand crafted domain model, along with the generated classes, plus a simple utility which utilizes reflection to convert between them when needed. We've contemplated writing a converter generator if we reach a point where reflection creates a performance bottleneck.

Related

What's the proper way to program in AS3?

I've read a lot of books and watched videos on AS3 and they all teach very interesting techniques that I can harness and use. However, I'm in slight confusion because I've seen different techniques that contradict each other from different sources. For example, I've seen some developers write all their code within the timeline and handle it that way. Other times, I've seen developers handle their code in an .as file in Flex/Flash Builder/FlashDevelop. I know that there is no "right or wrong" way to do it, but what is the more preferred way by professionals?
As of now, I just use my .FLA to hold my assets and I write all my code in .AS.
Major projects shoulds strive to avoid timeline code at all times, since it will quickly become very hard to maintain, understand and version control. In these projects FLA's are mostly used to hold library assets, which are then linked into the project via swc-files.
It's fine to use FLA timeline code for small stuff like banners though.
Absolutely try to avoid timeline code if you can. It can really become a nightmare to read or expand. Using classes is a great way to start branching out of flash as well. As you get comfortable with referencing external files it is important to start using design patterns such as MVC (Model View Controller. It makes maintenance, debugging, expansion and handing off projects easier.
Proper way is to use Object oriented programming (OOP) with design patterns (DP).
EDIT:
From Wikipedia:
Design pattern is a general reusable solution to a commonly occurring problem within a given context in software design.
In my opinion design patterns are integral part of OOP (without DP you loose most of the advantages of Object oriented language).
You can learn design patterns for AS3 from this great book: ActionScript 3.0 Design Patterns: Object Oriented Programming Techniques. This book has an active blog with lots of interesting articles.

Is there a good DSL for manipulating MySQL scripts independent of any particular web framework?

I have a simple MySQL script that I use in a web application to complete rebuild/reset my DB to a clean initial state. Thus, in this script I define the various tables, stored procs, etc. that I need.
This is fairly good initial solution b/c it's simple and does the job without being overkill. However there are some drawbacks. One example is typing. It would be nice to define stored procs with richer types so I don't need to repeat declarations like VARCHAR(64).
Thus, my question is: is there a good DSL for manipulating MySQL scripts? (e.g. it could ultimately generate valid MySQL scripts) that is effectively a nice DSL over MySQL, without trying to do too much and have too many bells and whistles. Would be nice if the language itself had decent support for DSL, but more importantly, it would be nice to find something that wasn't heavily wedded to a particular web framework.
Some cursory searches did not yield anything immediately obvious.
I guess one practical alternative is to just use your favorite ORM as a way of getting at a solution that's effectively nice. So part of the motivation of this question is to see if the DSL approach has been explored to any success.
I'm assuming you mean an Internal DSL (see http://martinfowler.com/bliki/DomainSpecificLanguage.html, and http://en.wikipedia.org/wiki/Domain-specific_language) because SQL is a DSL, i.e. an External DSL (by Martin Fowler's definition, which has gained fairly wide acceptance).
Given that assumption, and not knowing what language you prefer, I was able to find a few Internal DSL's for SQL code generation:
Ruby - sqldsl.rubyforge.org/
Java - code.google.com/p/sql-dsl/
Scala - github.com/p3t0r/scala-sql-dsl
if you google "SQL DSL" there are more, also try googling "SQL DSL [enter your favorite language here]" and you may find something more suitable.
Another approach which has a different set of advantages and disadvantages (than an internal DSL) would be generating the SQL code from a template. Either a template in the form of a string with variable escapes (or concatenation) or in a separate file using a template language.

Why should I use code generators

I have encountered this topic lately and couldn't understand why they are needed.
Can you explain why I should use them in my projects and how they can ease my life.
Examples will be great, and where from I can learn this topic little more.
At least you have framed the question from the correct perspective =)
The usual reasons for using a code generator are given as productivity and consistency because they assume that the solution to a consistent and repetitive problem is to throw more code at it. I would argue that any time you are considering code generation, look at why you are generating code and see if you can solve the problem through other means.
A classic example of this is data access; you could generate 250 classes ( 1 for each table in the schema ) effectively creating a table gateway solution, or you could build something more like a domain model and use NHibernate / ActiveRecord / LightSpeed / [pick your orm] to map a rich domain model onto the database.
While both the hand rolled solution and ORM are effectively code generators, the primary difference is when the code is generated. With the ORM it is an implicit step that happens at run-time and therefore is one-way by it's nature. The hand rolled solution requires and explicit step to generate the code during development and the likelihood that the generated classes will need customising at some point therefore creating problems when you re-generate the code. The explicit step that must happen during development introduces friction into the development process and often leads to code that violates DRY ( although some argue that generated code can never violate DRY ).
Another reason for touting code generation comes from the MDA / MDE world ( Model Driven Architecture / Engineering ). I don't put much stock in this but rather than providing a number of poorly expressed arguments, I'm simply going to co-opt someone elses - http://www.infoq.com/articles/8-reasons-why-MDE-fails.
IMHO code generation is the only solution in an exceedingly narrow set of problems and whenever you are considering it, you should probably take a second look at the real problem you are trying to solve and see if there is a better solution.
One type of code generation that really does enhance productivity is "micro code-generation" where the use of macros and templates allow a developer to generate new code directly in the IDE and tab / type their way through placeholders (eg namespace / classname etc). This sort of code generation is a feature of resharper and I use it heavily every day. The reason that micro-generation benefits where most large scale code generation fails is that the generated code is not tied back to any other resource that must be kept in sync and therefore once the code is generated, it is just like all the other code in the solution.
#John
Moving the creation of "basic classes" from the IDE into xml / dsl is often seen when doing big bang development - a classic example would be developers try to reverse engineer the database into a domain model. Unless the code generator is very well written it simply introduces an additional burden on the developer in that every time they need to update the domain model, they either have to context-switch and update the xml / dsl or they have to extend the domain model and then port those changes back to the xml / dsl ( effectively doing the work twice).
There are some code generators that work very well in this space ( the LightSpeed designer is the only one I can think of atm ) by acting as the engine for a design surface but often
these code generators generate terrible code that cannot be maintained (eg winforms / webforms design surfaces, EF1 design surface) and therefore rapidly undo any productivity benefits gained from using the code generator in the first place.
Well, it's either:
you write 250 classes, all pretty much the same, but slightly different, e.g. to do data access; takes you a week, and it's boring and error-prone and annoying
OR:
you invest 30 minutes into generating a code template, and let a generation engine handle the grunt work in another 30 minutes
So a code generator gives you:
speed
reproducability
a lot less errors
a lot more free time! :-)
Excellent examples:
Linq-to-SQL T4 templates by Damien Guard to generate one separate file per class in your database model, using the best kept Visual Studio 2008 secret - T4 templates
PLINQO - same thing, but for Codesmith's generator
and countless more.....
Anytime you need to produce large amounts of repetetive boilerplate code, the code generator is the guy for the job. Last time I used a code generator was when creating a custom Data Access Layer for a project, where the skeleton for various CRUD actions was created based on an object model. Instead of hand-coding all those classes, I put together a template-driven code generator (using StringTemplate) to make it for me. The advandages of this procedure was:
It was faster (there was a large amount of code to generate)
I could regenerate the code in a whim in case I detected a bug (code can sometimes have bugs in the early versions)
Less error prone; when we had an error in the generated code it was everywhere which means that it was more likely to be found (and, as noted in the previous point, it was easy to fix it and regenerate the code).
Using GUI builders, that will generate code for you is a common practice. Thanks to this you don't need to manually create all widgets. You just drag&drop them and the use generated code. For simple widgets this really saves time (I have used this a lot for wxWidgets).
Really, when you are using almost any programming language, you are using a "code generator" (except for assembly or machine code.) I often write little 200-line scripts that crank out a few thousand lines of C. There is also software you can get which helps generate certain types of code (yacc and lex, for example, are used to generate parsers to create programming languages.)
The key here is to think of your code generator's input as the actual source code, and think of the stuff it spits out as just part of the build process. In which case, you are writing in a higher-level language with fewer actual lines of code to deal with.
For example, here is a very long and tedious file I (didn't) write as part of my work modifying the Quake2-based game engine CRX. It takes the integer values of all #defined constants from two of the headers, and makes them into "cvars" (variables for the in-game console.)
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/game/cvar_constants.c
Here is the short Bash script which generated that code at compile-time:
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/autogen/constant_cvars.sh
Now, which would you rather maintain? They are both equivalent in terms of what they describe, but one is vastly longer and more annoying to deal with.
The canonical example of this is data access, but I have another example. I've worked on a messaging system that communicates over serial port, sockets, etc., and I found I kept having to write classes like this over and over again:
public class FooMessage
{
public FooMessage()
{
}
public FooMessage(int bar, string baz, DateTime blah)
{
this.Bar = bar;
this.Baz = baz;
this.Blah = blah;
}
public void Read(BinaryReader reader)
{
this.Bar = reader.ReadInt32();
this.Baz = Encoding.ASCII.GetString(reader.ReadBytes(30));
this.Blah = new DateTime(reader.ReadInt16(), reader.ReadByte(),
reader.ReadByte());
}
public void Write(BinaryWriter writer)
{
writer.Write(this.Bar);
writer.Write(Encoding.ASCII.GetBytes(
this.Baz.PadRight(30).Substring(0, 30)));
writer.Write((Int16)this.Blah.Year);
writer.Write((byte)this.Blah.Month);
writer.Write((byte)this.Blah.Day);
}
public int Bar { get; set; }
public string Baz { get; set; }
public DateTime Blah { get; set; }
}
Try to imagine, if you will, writing this code for no fewer than 300 different types of messages. The same boring, tedious, error-prone code being written, over and over again. I managed to write about 3 of these before I decided it would be easier for me to just write a code generator, so I did.
I won't post the code-gen code, it's a lot of arcane CodeDom stuff, but the bottom line is that I was able to compact the entire system down to a single XML file:
<Messages>
<Message ID="12345" Name="Foo">
<ByteField Name="Bar"/>
<TextField Name="Baz" Length="30"/>
<DateTimeField Name="Blah" Precision="Day"/>
</Message>
(More messages)
</Messages>
How much easier is this? (Rhetorical question.) I could finally breathe. I even added some bells and whistles so it was able to generate a "proxy", and I could write code like this:
var p = new MyMessagingProtocol(...);
SetFooResult result = p.SetFoo(3, "Hello", DateTime.Today);
In the end I'd say this saved me writing a good 7500 lines of code and turned a 3-week task into a 3-day task (well, plus the couple of days required to write the code-gen).
Conclusion: Code generation is only appropriate for a relatively small number of problems, but when you're able to use one, it will save your sanity.
A code generator is useful if:
The cost of writing and maintaining the code generator is less than the cost of writing and maintaining the repetition that it is replacing.
The consistency gained by using a code generator will reduce errors to a degree that makes it worthwhile.
The extra problem of debugging generated code will not make debugging inefficient enough to outweigh the benefits from 1 and 2.
For domain-driven or multi-tier apps, code generation is a great way to create the initial model or data access layer. It can churn out the 250 entity classes in 30 seconds ( or in my case 750 classes in 5 minutes). This then leaves the programmer to focus on enhancing the model with relationships, business rules or deriving views within MVC.
The key thing here is when I say initial model. If you are relying on the code generation to maintain the code, then the real work is being done in the templates. (As stated by Max E.) And beware of that because there is risk and complexity in maintaining template-based code.
If you just want the data layer to be "automagically created" so you can "make the GUI work in 2 days", then I'd suggest going with a product/toolset which is geared towards the data-driven or two-tier application scenario.
Finally, keep in mind "garbage in=garbage out". If your entire data layer is homogeneous and does not abstract from the database, please please ask yourself why you are bothering to have a data layer at all. (Unless you need to look productive :) )
How 'bout an example of a good use of a code generator?
This uses t4 templates (a code generator built in to visual studio) to generate compressed css from .less files:
http://haacked.com/archive/2009/12/02/t4-template-for-less-css.aspx
Basically, it lets you define variables, real inheritance, and even behavior in your style sheets, and then create normal css from that at compile time.
Everyone talks here about simple code generation, but what about model-driven code generation (like MDSD or DSM)? This helps you move beyond the simple ORM/member accessors/boilerplate generators and into code generation of higher-level concepts for your problem domain.
It's not productive for one-off projects, but even for these, model-driven development introduces additional discipline, better understanding of employed solutions and usually a better evolution path.
Like 3GLs and OOP provided an increase in abstraction by generating large quantities of assembly code based on a higher level specification, model-driven development allows us to again increase the abstraction level, with yet another gain in productivity.
MetaEdit+ from MetaCase (mature) and ABSE from Isomeris (my project, in alpha, info at http://www.abse.info) are two technologies on the forefront of model-driven code generation.
What is needed really is a change in mindset (like OOP required in the 90's)...
I'm actually adding the finishing touches to a code generator I'm using for a project I've been hired on. We have a huge XML files of definitions and in a days worth of work I was able to generate over 500 C# classes. If I want to add functionality to all the classes, say I want to add an attribute to all the properties. I just add it to my code-gen, hit go, and bam! I'm done.
It's really nice, really.
There are many uses for code generation.
Writing code in a familiar language and generating code for a different target language.
GWT - Java -> Javascript
MonoTouch - C# -> Objective-C
Writing code at a higher level of abstraction.
Compilers
Domain Specific Languages
Automating repetitive tasks.
Data Access Layers
Initial Data Models
Ignoring all preconceived notions of code-generation, it is basically translating one representation (usually higher level) to another (usually lower level). Keeping that definition in mind, it is a very powerful tool to have.
The current state of programming languages has by no means reached its full potential and it never will. We will always be abstracting to get to a higher level than where we stand today. Code generation is what gets us there. We can either depend on the language creators to create that abstraction for us, or do it ourselves. Languages today are sophisticated enough to allow anybody to do it easily.
If with code generator you also intend snippets, try the difference between typing ctor + TAB and writing the constructor each time in your classes. Or check how much time you earn using the snippet to create a switch statement related to an enum with many values.
If you're paid by LOC and work for people who don't understand what code generation is, it makes a lot of sense. This is not a joke, by the way - I have worked with more than one programmer who employs this technique for exactly this purpose. Nobody gets paid by LOC formally any more (that I know of, anyway), but programmers are generally expected to be productive, and churning out large volumes of code can make someone look productive.
As an only slightly tangential point, I think this also explains the tendency of some coders to break a single logical unit of code into as many different classes as possible (ever inherit a project with LastName, FirstName and MiddleInitial classes?).
Here's some heresy:
If a task is so stupid that it can be automated at program writing time (i.e. source code can be generated by a script from, let's say XML) then the same can also be done at run-time (i.e. some representation of that XML can be interpreted at run-time) or using some meta-programming. So in essence, the programmer was lazy, did not attempt to solve the real problem but took the easy way out and wrote a code generator. In Java / C#, look at reflection, and in C++ look at templates

Specification Pattern defined in Domain

Using Linq to SQL, and a DDD style Domain Layer with de-coupled repositories, does anyone have any good ideas on how to implement a specification pattern without bleeding L2S concerns up into the domain layer, that is still understandable? :)
We have complex business logic surrounding the selection of a set of transaction data, and would like those rules/specifications to be owned by the Domain. We've also done a good job of keeping our domain persistence ignorant.
This presents a problem, because in order to implement a Specification, the domain (as far as I can tell) needs to see the types being queried (L2S types).
Any ideas?
Also, nHibernate is out of the question for reasons I don't want to explain.. :)
Have you considered mapping your generic Specifications into an Expression tree that would translate into proper L2S syntax? It seems that is the main problem you are hitting here. The Specification pattern isn't the problem, but the mapping to L2S is.
Linq-To-Sql classes can be partial. This means that you can extend them by implementing a partial that implements a common interface. That Interface can be shared between layers without the "bleeding" problem you are describing. The rest is just the details of your "IsStatisfiedBy" which should be easy to encapsulate.
I recently had the same issue. Different pattern, but still LINQ to SQL (L2S). I tried two different ways to avoid the leakage.
First we tried using DTOs and a mapping layer. So we wrote super simple objects that had a one to one mapping to the tables. They were all decorated with L2S attributes. We then wrote a mapping layer to map the DTOs to our business objects. All of this was hidden via the Repository pattern from Doman Driven Design. So consumers of the business objects had no idea the L2S was under the hood.
Next, mostly for variety. We tried using the XML mapping features of L2S so the objects themselves needed no attributes. For collections we exposed IEnumerable instead of any of L2S collections. If you looked at the internals of the business classes you could still detect some usage of L2S (EntitySet or Ref). But consumers of the class had no idea. So some bits of leakage but nothing drastic.
In the end we stuck with the first pattern. The second worked and we could have replaced L2S without changing the interface of the business layer, but I was never happy with XML mapping. The first pattern had a much cleaner separation between the database and the business objects. It took more code. The first one also worked better for us because it allowed us to evolve the business objects differently than the tables. In the early days of the project the xml mapping worked because our objects were pretty much one to one with the tables.
So in the end we put a layer between L2S and the domain. It worked. It took more code, but it was really simple stuff. And it was all very testable.
If you want to avoid referencing Linq2Sql from your domain layer, you must work against interfaces that represent your entities instead of working with the actual entities themselves. You then need a mapping layer between your interfaces and your entities.
I've worked this way and found it to be a severe hindrance. I switched to NHibernate for new projects and for the older projects I simply stopped worrying about the domain referencing Linq2Sql entities directly. Overcoming that restriction is simply too much of a time-cost in my opinion.

Is there a simple Linq to SQL generator with bi-directional serialization attributes?

I'm trying to find a way to generate Linq to SQL classes with bi-directional serialization attributes. Basically I want a DataMember tag (with an appropriate order) on every association property, not just the ones where the class is the primary key (like the Visual Studio generator and SQL Metal do). I checked MyGeneration, but didn't really find anything that worked for me. I thought the T4 Toolbox was going to be my solution, it would be quite easy to modify it to add the attributes, but I get an exception on the calling side of my WCF service, and I've gotten no response back on the issue. I'm about to try installing CodeSmith and using PLINQO, but I'd prefer something free.
I'm pretty close to just writing my own T4 generator, but before I do that, I was hoping to find an pre-built solution to this rather simple problem first.
I ended up writing my own code generator for our L2S classes. We actually generate two sets of classes. One is a "lightweight" set of entities for client application use. These classes have no L2S plumbing. But they have the full datamember attributes with proper order. Then we have our L2S entities, which are strictly for backend use. This has worked out quite well.
Be careful using PLINQO. I've looked at that product extensively. In fact, much of my code generator is based on the code PLINQO generates. However, they have a "major flaw" (their words) in how they have implemented many to many relationships.
You might want to also look at a product named "Reegenerator".
Randy
This turned out to be the solution to my problem. I had just resigned myself to start researching my own generator when I stumbled upon that. It has a bidirectional serialization option and it works great! Here's a link to the author's bog, which contains a great video example of how to get started.