Linq 2 Sql and Dynamic table schemas

First, some background. Our application is built on ASP.NET MVC3 and .NET 4.0, and uses Linq-to-Sql (PLINQO) as its primary means of data access. Our web application is a multi-tenant/multi-client system where each client gets their own Sql Server database. Up to now, every client's database has had exactly the same schema.
Oftentimes, clients ask us to track custom fields in their database that other clients don't track. We've handled this by reserving a number of custom fields in our main tables. For example, our Widget table may have CustomText1 through CustomText10 and CustomDate1 through CustomDate10 fields. Again, all our schemas across clients are the same, so Linq-to-Sql handles these fields just as easily as any other field.
Now we are running into an issue where a client wants several hundred CustomBool fields, but doesn't need the others. So, basically, we are researching ways to keep using Linq-to-Sql but have it work against potentially different schemas depending on the database it is connected to (although the schemas differ only in this very specific way).
Too much code has already been built on Linq-to-Sql and the Widget classes it generates for us to simply fall back to straight SQL.
I've seen answers here and on the web describing ways for Linq-to-Sql to access different tables that share the same schema, but I have not found a good answer for the same table name across different databases with different columns.
Is this possible?

If the main objective is to store a few extra fields for existing domain objects, then why not create a generic table that can store key/value pairs? This is extremely flexible, since there is no need to change your schema if a customer requires a new property.
We do this frequently, and normally have some helpers to correctly cast the properties, e.g.
Service.GetProperty<bool>("SomeCustomProperty")
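A minimal sketch of what such a helper might look like, assuming a CustomProperties table with EntityId/Name/Value columns and values stored as strings (the table, the MyDataContext context, and all names here are illustrative, not the poster's actual schema):

    using System;
    using System.Linq;

    public class PropertyService
    {
        private readonly MyDataContext _db;  // hypothetical Linq-to-Sql context
        private readonly int _entityId;      // the domain object the properties belong to

        public PropertyService(MyDataContext db, int entityId)
        {
            _db = db;
            _entityId = entityId;
        }

        public T GetProperty<T>(string name)
        {
            // Values are persisted as strings and cast on the way out.
            var row = _db.CustomProperties.SingleOrDefault(
                p => p.EntityId == _entityId && p.Name == name);

            return row == null
                ? default(T)
                : (T)Convert.ChangeType(row.Value, typeof(T));
        }
    }

With something like this, the GetProperty<bool>("SomeCustomProperty") call above works against any tenant without schema changes.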
If you are looking for a more "pluggable" domain model that can be completely different for each tenant, I think you will struggle if you are following a database-driven approach and using the L2S designer to generate your code.
To achieve this you really need to be generating your database based on your code (domain-driven design), which will give you much more flexibility, i.e. you can load a tenant-specific configuration (set of classes, business rules etc.) at runtime and use this to generate/validate your schema.
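Purely as an illustration of what "tenant-specific configuration" could mean here (every type and property name below is hypothetical), something along these lines could be loaded at startup and compared against the live schema:

    using System;
    using System.Collections.Generic;

    public class CustomFieldDefinition
    {
        public string Name { get; set; }     // e.g. "CustomBool37"
        public Type FieldType { get; set; }  // e.g. typeof(bool)
    }

    public class TenantConfiguration
    {
        public string TenantName { get; set; }
        public IList<CustomFieldDefinition> WidgetCustomFields { get; private set; }

        public TenantConfiguration()
        {
            WidgetCustomFields = new List<CustomFieldDefinition>();
        }
    }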
Update
It would be good if you could elaborate on exactly what design approach you have taken i.e. are you using the Linq designer and generating your model from the database?
It's clear that a generic key value pair store is not going to meet your querying requirements.
It's hard to provide a solution without suggesting a different technology. Relational SQL databases aren't really suited for dynamic domain models. You may be better off with a document database such as MongoDb or RavenDb where you are not tied to a specific schema. You could even make use of these just for your custom properties.
If that's not ideal, then another solution would be to use something like Dapper to construct your queries. Assuming you are developing against interfaces, you can have an implementation of your data service per tenant that makes use of their custom fields.
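For example, a per-tenant implementation behind a shared interface might look roughly like this with Dapper (the interface, the Widget shape, and the CustomBool column names are assumptions for illustration):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using Dapper;

    public class Widget
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public bool CustomBool1 { get; set; }  // column only this tenant has
    }

    public interface IWidgetRepository
    {
        IEnumerable<Widget> GetWidgets();
    }

    public class AcmeWidgetRepository : IWidgetRepository
    {
        private readonly string _connectionString;

        public AcmeWidgetRepository(string connectionString)
        {
            _connectionString = connectionString;
        }

        public IEnumerable<Widget> GetWidgets()
        {
            using (var connection = new SqlConnection(_connectionString))
            {
                // This tenant's database has the extra bool column,
                // so its repository is free to select it.
                return connection.Query<Widget>(
                    "SELECT Id, Name, CustomBool1 FROM Widget");
            }
        }
    }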
Ayende did a whole series of posts on Multitenancy and covers tenant specific domain models. It starts here and may be of some use to you.

Related

Is it possible to create a SQL database using a Core Data schema

I have a core data schema file with relationships between the entities.
I need to create a SQL database and would like to know if it can be created automatically (MySql or MS-SQL) using only this file.
Looking at the SQLite DB I see that the relationships are not mapped in any logical way.
First, your assessment that the relationships are "not mapped in any logical way" is not correct. If you look carefully at the Core Data generated database, you will discover that the relationships are mapped exactly as in any plain old relational database schema, i.e. with foreign keys referring to rows in other tables.
Also, the naming conventions in these SQLite databases are very transparent (e.g., entity and attribute names start with Z, etc.).
That being said, I would strongly discourage you from hacking the Core Data generated database file, or even from using it to inform another database schema, the reason being that these are undocumented features that could change at any time without notice and thus break any code you write based on them.
IMO, the most practical thing to do is to rewrite the model quickly in the usual MySQL schema format and update it manually as well when you change the managed object model.
If you would like to automate the process, there is a rich set of APIs provided for interpreting and parsing NSManagedObjectModel, including classes like NSEntityDescription, NSAttributeDescription, etc. You could write a framework that iterates through your entities and attributes and generates a text file that is a readable schema for MySQL, complete with information about indexing, versions, etc.
If you go down that route, please make sure to notify us and do post your framework on Github for the benefit of others.
If you use Core Data you can create a SQL-based database from a schema file, but its structure is entirely controlled by the Core Data framework. Apple specifically tell us as developers to leave it alone and not to edit it using libsqlite or any other method. If you do, then Core Data won't have anything to do with you!
In terms of making your own DB using one of Apple's schema files, I'm sure it is possible, but you'd have to know the inner workings of the Core Data framework to even attempt it.
In terms of making your own SQLite DB then you have to sort out all the relationships and mapping yourself.
I think that mixing and matching Core Data resources and custom built SQLite databases is probably a headache waiting to happen. I have used both methods and find that Core Data is brilliant (especially with iCloud) as long as you're OK with your App being limited to Apple only.

Multiple Linq data models with the same table being mapped in each: re-use mapping

I've implemented the repository pattern on the data access layer of our current service layer.
We have an object model where the same class, "historical notes", is mapped onto multiple objects (currently 6, but soon to be more!).
Part of the best practices for the use of Linq to Sql is not to have one dbml file containing every table in the db, but instead to break it down, so that there isn't a huge performance hit when the context is created.
Unfortunately, the logical places to separate the objects leave the historical notes in 5 different DBML files. When the Linq generator creates the classes, it generates a different class in each namespace.
I have a historical note object in the domain model, but I don't want to re-map the domain object model into the data model every time we use the historical notes.
One of the things I don't want to do is break the "reading" of the data into multiple queries.
Is there a way I can map the historical note into multiple data models but only write the mapping once?
Thanks
Pete
Solution
Thanks for the help, I think I'm going to move back to one data context for all the data tables.
The workarounds involved in setting up the multiple models aren't worth the extra complexity and potential fragility of the code. Having to write the same left-hand/right-hand mapping code for the historical notes everywhere is too much work and too many places to keep in sync.
Thanks guys for the input
"Part of the best practices for the use of Linq to Sql is not to have one dbml file containing every table in the db, but instead to break it down, so that there isn't a huge performance hit when the context is created."
Where did you hear that? I don't agree. The DataContext is generally a fairly lightweight object, regardless of the number of tables.
See here for an analysis of the issues involving multiple data contexts:
LINQ to SQL: Single Data Context or Multiple Data Contexts?
http://craftycodeblog.com/2010/07/19/linq-to-sql-single-data-context-or-multiple/
In my opinion, you should have one datacontext per database. This would also solve your mapping problems.
See also LINQ to SQL: Multiple / Single .dbml per project?
One option could be to put the historical notes in their own datacontext, and keep the relationships between this object and the rest of your model as 'ids' (so just foreign keys in the db). That's how I would do it anyway.
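A rough sketch of that arrangement, with all of the type names invented for illustration:

    using System.Collections.Generic;
    using System.Linq;

    public class HistoricalNoteService
    {
        private readonly HistoricalNotesDataContext _notes =
            new HistoricalNotesDataContext();

        // The other models carry only the ids (plain foreign key columns);
        // the note rows themselves are loaded through the dedicated context.
        public List<HistoricalNote> GetNotes(IEnumerable<int> noteIds)
        {
            return _notes.HistoricalNotes
                         .Where(n => noteIds.Contains(n.Id))
                         .ToList();
        }
    }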

Best way to create a SPARQL endpoint for a RDBMS (MySQL database)

I am doing (want to do) some experiments with Linked Open Datasets particularly those put out by governments.
I have an RDBMS (more specifically MySQL). I designed it with semantic web ideas in mind, i.e. I have information stored as objects, predicates and classes which define objects. In turn, all objects are related to each other through statements of the form subject --> predicate --> object (where the subjects are from the objects table).
I want to be able to query other RDF triple stores from my application and let other triple stores query my data. Is it possible to "set something up" so that this is possible?
I have looked at Jena. Using Jena seems to mean I have to use it as the storage layer rather than MySQL - the only problem with this is that I include a new concept called a category (which I don't think is part of the semantic web languages). I will use categories to help with displaying information (they don't have any other meaning), but using Jena seems to mean that I can't organise predicates under categories for more convenient viewing.
I am using Java, so a Java API is preferred.
It's also possible I misunderstood the purpose of Jena, and maybe that can be of use, but I am not sure how.
I am sure four days from now this question will seem rather silly, but at the moment I am somewhat confused about how to proceed.
I'm not sure what you mean by "a new concept called category", perhaps you can give an example?
If you mean that you want to add additional metadata, perhaps as a way of organizing information in the user interface, there is no need to extend the semantic web languages or storage systems - they can already do what you want.
Suppose you have data for a school from the UK Government schools dataset (using Turtle encoding for brevity):
@prefix sch-ont: <http://education.data.gov.uk/def/school/> .

<http://education.data.gov.uk/id/school/135412>
    a sch-ont:School ;
    sch-ont:establishmentStatus
        <http://education.data.gov.uk/def/school/EstablishmentStatus_Open> ;
    sch-ont:MSOA <http://statistics.data.gov.uk/id/msoa/E02000001> ;
    sch-ont:establishmentName "Guildhall School of Music and Drama" ;
    ...
You can directly query that data from the SPARQL end-point, or you can download the data and store it locally in your own triple store. Either way, you're perfectly at liberty to add extra information that's useful to your users. For example:
@prefix ankurs-app: <http://ankur.org/example/app/vocab/display#> .

<http://education.data.gov.uk/id/school/135412>
    ankurs-app:category ankurs-app:wkdCool .
You can store this new triple in the same graph as the downloaded data, or you can store it in a separate named-graph to indicate that it's information that has a different provenance than the source data. Either way, it's then simple to query it either programmatically from Jena, or via a SPARQL query.
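For example, a SPARQL query over the combined data might look like this (reusing the prefixes from the snippets above):

    PREFIX sch-ont:    <http://education.data.gov.uk/def/school/>
    PREFIX ankurs-app: <http://ankur.org/example/app/vocab/display#>

    # Find every school tagged with the app's own display category.
    SELECT ?school ?name
    WHERE {
      ?school a sch-ont:School ;
              sch-ont:establishmentName ?name ;
              ankurs-app:category ankurs-app:wkdCool .
    }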
Doing a layout for efficiently querying schemaless triple-centric data is a well-studied, and hard, problem. Most of the RDF platforms, including Jena, have well-optimised code for querying and updating triples from their own database schemes. You would have to have very good reasons for embarking on your own relational table layout :)
If you really do need to take an existing relational table scheme and map it to a Jena RDF model, look at D2RQ.
Why didn't you just use a triple store to store all of your data? If you use a triple store with SPARQL endpoint capability then you would have a SPARQL-accessible web api. Similarly, many other data sets on the web are exposed as SPARQL endpoints and accessible via HTTP.
There are many triple stores available with persistent storage both in a db and otherwise (Jena + SDB, Mulgara, Virtuoso, Oracle, etc). You could certainly extend Mulgara through their resolvers to support queries against your custom db but I think that's probably a lot of work for not too much real value.
I'm sure you could use existing concepts to handle your notion of categories in RDF or perhaps by layering something over Jena.

repository pattern with a legacy database and Linq to SQL

I'm building an application on top of a legacy database (which I cannot change). I'm using Linq to SQL for the data access, which means I have a (Linq to SQL) class for each table.
My domain model does not match with the database. For example, there are two tables named Users and Employees, and therefore I have two Linq to SQL classes named User and Employee. But in my domain model I'd like to have a User class which should contain some fields from either table (but I don't care about a lot of the other fields of these tables).
I'm not sure how I should design my repositories:
should the repositories perform the mapping from the Linq to SQL classes (e.g. User, Employee) to the domain classes (User), and only return the domain classes to the application
or should my repositories return the Linq to SQL classes and leave the mapping to the caller
The first approach seems to make more sense to me, but is this the correct way to implement my repositories?
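For reference, the first approach might look something like this sketch, where the repository projects straight into the domain class (the join condition, the Domain.User shape, and the property names are assumptions about the legacy schema):

    using System.Linq;

    public class UserRepository
    {
        private readonly LegacyDataContext _db = new LegacyDataContext();

        // Returns the domain User; the Linq to SQL classes never escape.
        public Domain.User GetUser(int userId)
        {
            return (from u in _db.Users
                    join e in _db.Employees on u.EmployeeId equals e.Id
                    where u.Id == userId
                    select new Domain.User
                    {
                        Id = u.Id,
                        Name = u.Name,
                        Department = e.Department  // field taken from Employees
                    }).SingleOrDefault();
        }
    }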
The purist (I try to stay pure) will tell you that your model represents your data. And therefore, anything that needs to be persisted is done so only when needed through repositories. Also, when you have complex entities, you want to use a service to combine them. For example, User + Employee = UserEmployee entity that is only accessible through an IUserEmployeeService.
With those vague statements, you have an excellent opportunity here.
Build an anti-corruption layer, which allows you to start moving off of the legacy DB at the same time.
This is another chapter in the DDD playbook. An Anti-Corruption Layer is used to interface with a legacy system using Facades, Translators, and Adapters to isolate the legacy DB from your pure Domain model.
Now, this may be a lot more work than you wanted. So, you have to ask yourself at this point:
"Do I want to start the process of moving off of this legacy DB, or will it remain for the life of the application?"
If your answer is you can start migrating, then model your actual domain the way you want it. Persist it with normal repositories and services. Have fun designing it the way YOU want it stored. Then, use the services of the aggregate roots to reach into the anti-corruption layer and pull out the entities, store/update them locally, and translate into your domain's entities.
If the answer is that the legacy DB will remain for the life of the project, then your task is much easier. Use your domain's services (e.g. UserEmployeeService) to reach into the anti-corruption's UserFacade and EmployeeFacade (similar to a "Remote Service" concept).
Within the Facades, access the legacy db using the Adapters (e.g. LegacyDbSqlDatabase) to get a raw LegacyUser(). The next step would be to use a UserTranslator() and EmployeeTranslator() mapper that converts the legacy user data into your actual domain's version of the User() entity, and return it from the UserFacade back to your UserEmployeeService, where it is combined with the Employee entity that came from the same place.
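A compressed sketch of those pieces (UserFacade, UserTranslator, and LegacyDbSqlDatabase come from the description above; the member names and the shape of LegacyUser are invented):

    public class UserFacade
    {
        private readonly LegacyDbSqlDatabase _legacyDb;
        private readonly UserTranslator _translator;

        public UserFacade(LegacyDbSqlDatabase legacyDb, UserTranslator translator)
        {
            _legacyDb = legacyDb;
            _translator = translator;
        }

        public Domain.User GetUser(int id)
        {
            var legacyUser = _legacyDb.GetLegacyUser(id);  // raw legacy row
            return _translator.Translate(legacyUser);      // domain entity out
        }
    }

    public class UserTranslator
    {
        // Copies only the fields the domain actually cares about.
        public Domain.User Translate(LegacyUser source)
        {
            return new Domain.User
            {
                Id = source.UserId,
                Name = source.FullName
            };
        }
    }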
Whoa, that was a lot of typing...
With your Adapters and Facades of your Anti-Corruption layer, you can do your Linq-to-Sql or whatever you want to do. It doesn't matter because you have completely isolated the legacy DB/system away from your nice and pure Domain - your domain that has its own version of User() and Employee() entities and value objects.
DDD and Linq To SQL don't go together very well because the generated classes are not meant to deviate significantly from your DB table structure. You'll have to either map your classes in a way that makes working with Linq to SQL a pain or just live with a non-ideal object model.
If you really want to utilize DDD and the repository pattern, go for Entity Framework or, even better, NHibernate.

What is the best way to build a data layer across multiple databases?

First a bit about the environment:
We use a program called Clearview to manage service relationships with our customers, including call center and field service work. In order to better support clients and our field technicians, we also developed a web site to provide access to the service records in Clearview, plus reporting. Over time, our need to customize the behavior and add new features led to more and more things being tied to this website and its database.
At this point we're dealing with things like a Company being defined partly in the Clearview database and partly in the website database. For good measure we're also starting to tie the scripting for our phone system into the same website, which will require talking to the phone system's own database as well.
All of this is set up and working... BUT we don't have a good data layer to work with it all. We moved to Linq to SQL and now have two DBMLs that we can use, along with some custom classes I wrote before I'd ever heard of Linq, plus some of the old-style ADO datasets. So yeah, basically things are a mess.
What I want is a data layer that provides a single front end for our applications, and on the back end manages everything into the correct database.
I had heard something about Entity Framework allowing classes to be built from multiple sources, but it turns out there can only be one database. So the question is, how could I proceed with this?
I'm currently thinking of getting the Linq To SQL classes all set for each database, then manually writing Linq compatible front ends that tie those together. Seems like a lot of work, and given Linq's limitations (such as not being able to refresh) I'm not sure it's a good idea.
Could I do something with Entity Framework that would turn out better? Should I look into another tool? Am I crazy?
The Entity Framework does give a certain measure of database independence, insofar as you can build an entity model from one database, and then connect it to a different database by using a different entity connection string. However, as you say, it's still just one database, and, moreover, it's limited to databases which support the Entity Framework. Many do, but not all of them. You could use multiple entity models within a single application in order to combine multiple databases using the Entity Framework. There is some information on this on the ADO.NET team blog. However, the Entity Framework support for doing this is, at best, in an early stage.
My approach to this problem is to abstract my use of the Entity Framework behind the Repository pattern. The most immediate benefit of this, for me, is to make unit testing very simple; instead of trying to mock my Entity model, I simply substitute a mock repository which returns IQueryables. But the same pattern is also really good for combining multiple data sources, or data sources for which there is no Entity Framework provider, such as a non-data-services-aware Web service.
So I'm not going to say, "Don't use the Entity Framework." I like it, and use it, myself. In view of recent news from Microsoft, I believe it is a better choice than LINQ to SQL. But it will not, by itself, solve the problem you describe. Use the Repository pattern.
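As a sketch, the shape of that abstraction might be as follows (the IRepository<T> interface and the Company/MyEntities names are illustrative):

    using System.Collections.Generic;
    using System.Linq;

    public interface IRepository<T>
    {
        IQueryable<T> Query();
    }

    // Production implementation backed by the Entity Framework context.
    public class EfCompanyRepository : IRepository<Company>
    {
        private readonly MyEntities _context = new MyEntities();

        public IQueryable<Company> Query()
        {
            return _context.Companies;
        }
    }

    // Test double: unit tests query an in-memory list, no database needed.
    public class FakeCompanyRepository : IRepository<Company>
    {
        private readonly List<Company> _data = new List<Company>();

        public IQueryable<Company> Query()
        {
            return _data.AsQueryable();
        }
    }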
If you want to use tools like Linq2Sql or EF and don't want to have to manage multiple DBMLs (or whatever they're called in EF or other tools), you could create views in your website database that reference back to the Clearview or phone system's DB.
This allows you to decouple your web site from their database structure. I believe Linq2Sql and EF can use a view as the source for an entity. If they can't, look at NHibernate.
This will also let you have composite entities that are pulled from the various data sources. There are some limitations when updating views in SQL Server; however, you can define your own INSTEAD OF trigger(s) on the view, which can then perform the actual insert/update/delete statements.
L2S works perfectly with views in my project. You only need one small trick:
1. Add the secondary DB's table to the current DB as a view.
2. In the designer, add a primary key attribute to an id field on the view.
3. Only now, add an association to whatever other table you want in the original DB.
Now the view is available for navigation.
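Once that's done, the view queries like any other entity, associations included. A hypothetical example (all names invented):

    using System.Linq;

    public class CallService
    {
        // ClearviewCalls is assumed to be a designer entity over a view
        // that selects from the other database; Company is an associated
        // entity in the website database.
        public int CountCallsFor(int companyId)
        {
            using (var db = new WebsiteDataContext())
            {
                return db.ClearviewCalls.Count(c => c.Company.Id == companyId);
            }
        }
    }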