One codebase, two clients, two versions of a Doctrine ORM entity - mysql

I have an app that collects data. It's a survey of sorts. The questions for the survey can be managed by a GUI tied to database tables in the app. But the actual answers to the questions get stored in a single table: observations. I've considered an EAV model instead, but let's set that aside for the moment. The Observation entity has over 900 properties because the survey has around that many questions. This has worked ok so far, even if it is a bit ugly in spots. But now I'm working on making this app power a new survey from a new client. It's key that I maintain the same codebase and the same git repository, but the app needs to accommodate another 700 observation properties. I added them to my entity and attempted to do a migration to create the new database columns. But alas, I hit an error telling me that the row size is too large. Too many columns!
The workaround I'd like to explore is to have multiple versions of the Observation entity. I could have one for each survey and use a config file to select the right one. But I want the selected entity to sit in the same spot in the ORM hierarchy. So, for example, if I call
$subscription->getObservation()
I want it to return the right kind of observation based on the config. It's OK if each install ends up having a table for each survey, because all but one of those tables would have 0 rows.
As mentioned above, another option would be to abandon the wide-table design and use EAV. But that approach has some major downsides.
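To make the goal concrete, here is a minimal sketch of the config-driven selection described above. It is written in Python rather than PHP/Doctrine purely for illustration, and every class name, table name, and config key in it is hypothetical; it shows the shape of the idea, not how Doctrine would map it.

```python
from dataclasses import dataclass, field


# Hypothetical per-survey observation classes; each would map to its own table.
@dataclass
class SurveyAObservation:
    table_name = "observations_survey_a"
    answers: dict = field(default_factory=dict)


@dataclass
class SurveyBObservation:
    table_name = "observations_survey_b"
    answers: dict = field(default_factory=dict)


# Registry keyed by the value stored in the install's config file (assumed key: "survey").
OBSERVATION_CLASSES = {
    "survey_a": SurveyAObservation,
    "survey_b": SurveyBObservation,
}


def observation_class_for(config: dict) -> type:
    """Return the observation class selected by config['survey']."""
    try:
        return OBSERVATION_CLASSES[config["survey"]]
    except KeyError as exc:
        raise ValueError(f"unknown survey in config: {exc}") from exc


class Subscription:
    def __init__(self, config: dict):
        # The concrete class is decided once, from config.
        self._observation = observation_class_for(config)()

    def get_observation(self):
        # Callers never name the concrete observation class.
        return self._observation
```

With this shape, calling code keeps using get_observation() and only the config value changes between installs.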

Related

How should I store similar entities - in one table or several?

I am creating a CV website but, unlike most, I am trying to back it with a database. Usually such websites are static, with all of the information hard-coded in the HTML. Since I am a back-end developer, I would like everything, including buttons and welcome messages, to be taken from the database. I am trying to store projects that I have worked on. There are several types:
Github Repository - a project that is done purely on GitHub.
Work related - a project I have done at work with no GitHub repository, only a link to view the final result.
UpWork or other freelance website - as a freelancer I take on projects to fix something on a website; those projects can be viewed only on my profile there, and I would like to list them with a link to UpWork or wherever there is information on what exactly I was hired to do.
Now my question is: should I have different entities, and therefore different tables, for these types of projects, or should I have all of the possible properties in one table? For example, if it is GitHub there is a repository field, and if it is work related there is a company field. If it is freelance, it has a link to the website I was hired on. There are also different sub-types: web applications, desktop applications, games and so on.
As you can guess, the differences are small (1 or 2 properties). I could very easily leave some properties empty and add another property, projectType, but is this the right way? Should I have different tables and entities for them?
To give some info: I can work with both MySQL and NoSQL, and I haven't decided yet which one my website should be built on. I am currently leaning towards NoSQL. This means I am asking how to store the projects in both MySQL and NoSQL (by NoSQL I mean MongoDB). If it helps, the languages I am choosing from are PHP (MySQL) and JavaScript (NoSQL).
I know that questions without code are usually downvoted, but this is more of a logic-based problem: I know how to do it, but I don't know the best practices for my situation. That being said, here is a small piece of code for you:
console.log('Thank you in advance')
MongoDB lends itself very well to this exact situation.
You can create a collection where documents leave out fields that are not needed for their type. MongoDB's query operators let you check $exists on a field if you need to, and an absent field takes up no space in the stored document.
You can even set up a sparse index, which only contains entries for documents that have the indexed field. As long as your core document structure is the same, it is a good idea to keep them in one collection and vary them based on their type.
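As a rough illustration of the one-collection approach, the sketch below uses plain Python dicts as stand-ins for documents and a small helper that mimics what an $exists filter means. It does not talk to MongoDB, and the field names and sample values are made up for the example.

```python
# Plain dicts stand in for MongoDB documents in a single "projects" collection.
# Each document only carries the fields that make sense for its type.
projects = [
    {"title": "CV site", "type": "github", "repository": "example/cv-site"},
    {"title": "Billing portal", "type": "work", "company": "Example Corp"},
    {"title": "Checkout fix", "type": "freelance",
     "profile_link": "https://example.com/profile"},
]


def where_exists(docs, field_name, exists=True):
    """Mimic the meaning of a {field_name: {"$exists": exists}} filter."""
    return [doc for doc in docs if (field_name in doc) == exists]


# Documents that have a repository field at all.
with_repo = where_exists(projects, "repository")

# Documents that do not carry a company field.
without_company = where_exists(projects, "company", exists=False)
```

Against a real collection, the corresponding filter document would be {"repository": {"$exists": True}} passed to find.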

Creating a CakePHP REST api from an existing project

I have a webapp which I am planning on converting into a REST api and have decided to use CakePHP for this - the current form is written in ColdFusion.
The database is a couple million records in size with 20 tables or so and a few associative tables that handle the many-to-many situations.
I'm looking for the best method to start the CakePHP solution mainly in regards to the database. Should I import my existing db and just use cake to access its current form? Should I bake a fresh database structure (in order to stay within the cake standards) then figure out how to get my data into the new db, and maintain relationships etc (how?)?
Edit:
There are many users on the existing app, but when the new CakePHP api is setup and ready to go, the old service will be closed to use the new one.
The current app is not designed in an MVC way, are you referring to Models as being synonymous with Tables? There are many existing tables with foreign key relationships but they are not named using the CakePHP standards - so not sure if this will break CakePHP or make its features not as usable.
Time is an issue, but I'd rather take the time now and get it done the correct way, instead of having to re-visit shortly.
It largely depends on your situation:
Are there people using the old application? - This will mean you can't really create another database for your new app if you want to access new information.
Is there really a need to change the relationships of models? - I don't think you should change them unless you really need to.
Cost - How much time are you willing to spend on the migration?
Note:
You can modify almost everything on your model to cope with the previous database/table structure.
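The general idea of overriding conventions per model can be sketched in a framework-neutral way. The Python below is only an illustration: the class and attribute names are hypothetical and are not CakePHP's API, and the legacy table and column names are invented. CakePHP has its own settings for table name, primary key, and association keys.

```python
class LegacyModel:
    """Base class whose conventions can be overridden per model (hypothetical)."""

    table = None          # legacy table name
    primary_key = "id"    # legacy primary key column
    column_map = {}       # attribute name -> legacy column name

    @classmethod
    def select_sql(cls):
        # Alias each legacy column to the attribute name the app wants to use.
        cols = ", ".join(
            f"{column} AS {attr}" for attr, column in cls.column_map.items()
        )
        return f"SELECT {cols} FROM {cls.table} ORDER BY {cls.primary_key}"


class User(LegacyModel):
    # The existing schema keeps its names; only the model declares the mapping.
    table = "tblUsers"
    primary_key = "UserID"
    column_map = {"id": "UserID", "email": "EmailAddr"}
```

The database stays untouched, and the naming mismatch is absorbed in one place per model.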

Linq 2 Sql and Dynamic table schemas

First a background. Our application is built on ASP.NET MVC3, .NET 4.0, and uses Linq-to-Sql (PLINQO) as its primary means of data access. Our web application is a multi-tenant/multi-client system where each client gets their own Sql Server database. Each Sql Server database up to now has had exactly the same schema.
Often times, clients will ask us to track custom fields in their Db that other clients don't track. The way we've handled this is by reserving a number of customfields in the db in our main tables. For example, our Widget table may have a CustomText1, CustomText2.. CustomText10, and a CustomDate1, CustomDate2..CustomDate10 fields. Again, all our schemas across clients are the same, so Linq-to-Sql handles these fields just as easily as any other field.
Now we are running into an issue where a client wants several hundred CustomBool fields, but doesn't need the others. So, basically, we are researching for ways to still use the Linq-to-Sql, but have it work against potentially different schemas depending on the database it is connected to (although they are different in a very specific way.)
Too much code has already been built on Linq-to-Sql and the Widget classes it generates for me to just fall back to straight SQL.
I've seen answers here and on the web on ways for Linq to Sql to access different tables that have the same schema, but I have not found a good answer to the same table name across different dbs with different columns.
Is this possible?
If the main objective is to store a few extra fields for existing domain objects, then why not create a generic table that can store key-value pairs? This is extremely flexible, since there is no need to change your schema if a customer requires a new property.
We do this frequently and normally have some helpers to correctly cast the properties e.g.
Service.GetProperty<bool>("SomeCustomProperty")
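A typed helper of that kind might look roughly like the following. This is a Python sketch rather than the C# used in the thread; the row layout (entity id, name, text value), the sample property names, and the get_property signature are all assumptions made for illustration.

```python
from datetime import date

# Rows as they might come back from a generic (entity_id, name, value) table.
# Every value is stored as text and cast on the way out.
ROWS = [
    (1, "IsFragile", "true"),
    (1, "ShipBy", "2024-03-01"),
    (1, "MaxStack", "12"),
]


def _to_bool(text):
    lowered = text.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# One caster per supported target type.
CASTERS = {bool: _to_bool, int: int, str: str, date: date.fromisoformat}


def get_property(entity_id, name, as_type, default=None):
    """Look up a custom property and cast it to the requested type."""
    for row_id, row_name, raw in ROWS:
        if row_id == entity_id and row_name == name:
            return CASTERS[as_type](raw)
    return default
```

Usage would be along the lines of get_property(1, "IsFragile", bool), with a default returned when the customer never set the property.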
If you are looking for a more "pluggable" domain model that can be completely different for each tenant, I think you will struggle if you are following a database driven approach and using the L2S designer to generate your code.
To achieve this you really need to be generating your database based on your code (domain driven design) which will give you much more flexibility i.e. you can load a tenant specific configuration (set of classes, business rules etc.) at runtime and use this to generate/validate your schema.
Update
It would be good if you could elaborate on exactly what design approach you have taken i.e. are you using the Linq designer and generating your model from the database?
It's clear that a generic key value pair store is not going to meet your querying requirements.
It's hard to provide a solution without suggesting a different technology. Relational SQL databases aren't really suited for dynamic domain models. You may be better off with a document database such as MongoDb or RavenDb where you are not tied to a specific schema. You could even make use of these just for your custom properties.
If that's not ideal, then another solution would be to use something like Dapper to construct your queries. Assuming you are developing against interfaces, you can have an implementation of your data service per tenant that makes use of their custom fields.
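The per-tenant implementation idea can be sketched as below. This is Python rather than C#, it does not use Dapper, and the tenant name, service classes, and column lists are hypothetical; it only shows one interface with a tenant-specific implementation selected at runtime.

```python
from typing import Protocol


class WidgetService(Protocol):
    """Interface the rest of the application codes against."""

    def widget_columns(self) -> list[str]: ...


class DefaultWidgetService:
    # Tenants on the standard schema.
    def widget_columns(self) -> list[str]:
        return ["Id", "Name", "CustomText1", "CustomDate1"]


class BoolHeavyWidgetService:
    # A tenant whose Widget table carries many CustomBool columns instead.
    def widget_columns(self) -> list[str]:
        return ["Id", "Name"] + [f"CustomBool{i}" for i in range(1, 4)]


# Hypothetical tenant-to-implementation mapping; anything else gets the default.
SERVICES = {"acme": BoolHeavyWidgetService}


def service_for(tenant: str) -> WidgetService:
    return SERVICES.get(tenant, DefaultWidgetService)()


def build_query(tenant: str) -> str:
    # The query text is built from whatever columns this tenant's service exposes.
    columns = ", ".join(service_for(tenant).widget_columns())
    return f"SELECT {columns} FROM Widget"
```

Only the tenant-specific class knows about the extra columns; callers depend on the interface alone.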
Ayende did a whole series of posts on Multitenancy and covers tenant specific domain models. It starts here and may be of some use to you.

Multiple Linq data models with the same table being mapped in each Re-use mapping

I've implemented the repository pattern on the data access layer of our current service layer.
We have an object model where the same class, "historical notes", is mapped on multiple objects (currently 6 but soon to be more!)
Part of the best practices for the use of linq to sql is not to have one dbml file for every table in the db, but instead to break it down, this way it doesn't have a huge performance hit when the context is created.
Unfortunately the logical places to separate the objects leaves the historical notes in 5 different DBML files. When the linq generator creates the classes it generates a different class in the different namespace.
I have a historical note object in the domain model, but I don't want to re-map the domain object model into the data model for every time we use the historical notes.
One of the things I don't want to do is break the "reading" of the data into multiple queries.
Is there a way I can map the historical note into multiple data models but only write the mapping once?
Thanks
Pete
Solution
Thanks for the help, I think I'm going to move back to one data context for all the data tables.
The workarounds involved in setting up the multiple models aren't worth the extra complexity and potential fragility of the code. Having to write the same left-hand, right-hand code to map the historical notes is too much work, with too many places to keep the code in sync.
Thanks guys for the input
"Part of the best practices for the use of linq to sql is not to have one dbml file for every table in the db, but instead to break it down, this way it doesn't have a huge performance hit when the context is created."
Where did you hear that? I don't agree. The DataContext is generally a fairly lightweight object, regardless of the number of tables.
See here for an analysis of the issues involving multiple data contexts:
LINQ to SQL: Single Data Context or Multiple Data Contexts?
http://craftycodeblog.com/2010/07/19/linq-to-sql-single-data-context-or-multiple/
In my opinion, you should have one datacontext per database. This would also solve your mapping problems.
See also LINQ to SQL: Multiple / Single .dbml per project?
One option could be to put the historical notes in their own datacontext, and keep the relationships between this object and the rest of your model as 'ids' (so just foreign keys in the db). That's how I would do it anyway.

What is the best way to build a data layer across multiple databases?

First a bit about the environment:
We use a program called Clearview to manage service relationships with our customers, including call center and field service work. In order to better support clients and our field technicians, we also developed a web site to provide access to the service records in Clearview and reporting. Over time, our need to customize the behavior and add new features led to more and more things being tied to this website and its database.
At this point we're dealing with things like a Company being defined partly in the Clearview database and partly in the website database. For good measure we're also starting to tie the scripting for our phone system into the same website, which will require talking to the phone system's own database as well.
All of this is set up and working... BUT we don't have a good data layer to work with it all. We moved to Linq to SQL and now have two DBMLs that we can use, along with some custom classes I wrote before I'd ever heard of Linq, along with some of the old style ADO datasets. So yeah, basically things are a mess.
What I want is a data layer that provides a single front end for our applications, and on the back end manages everything into the correct database.
I had heard something about Entity Framework allowing classes to be built from multiple sources, but it turns out there can only be one database. So the question is, how could I proceed with this?
I'm currently thinking of getting the Linq To SQL classes all set for each database, then manually writing Linq compatible front ends that tie those together. Seems like a lot of work, and given Linq's limitations (such as not being able to refresh) I'm not sure it's a good idea.
Could I do something with Entity Framework that would turn out better? Should I look into another tool? Am I crazy?
The Entity Framework does give a certain measure of database independence, insofar as you can build an entity model from one database, and then connect it to a different database by using a different entity connection string. However, as you say, it's still just one database, and, moreover, it's limited to databases which support the Entity Framework. Many do, but not all of them. You could use multiple entity models within a single application in order to combine multiple databases using the Entity Framework. There is some information on this on the ADO.NET team blog. However, the Entity Framework support for doing this is, at best, in an early stage.
My approach to this problem is to abstract my use of the Entity Framework behind the Repository pattern. The most immediate benefit of this, for me, is to make unit testing very simple; instead of trying to mock my Entity model, I simply substitute a mock repository which returns IQueryables. But the same pattern is also really good for combining multiple data sources, or data sources for which there is no Entity Framework provider, such as a non-data-services-aware Web service.
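A minimal sketch of that arrangement follows, in Python rather than C# and without Entity Framework or IQueryable. The Company fields, the two source dictionaries, and the repository class names are hypothetical; the point is that the calling code sees one repository interface whether the data comes from two databases or from a test double.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass
class Company:
    id: int
    name: str
    phone_queue: str


class CompanyRepository(Protocol):
    """The single front end that application code depends on."""

    def all(self) -> Iterable[Company]: ...


class InMemoryCompanyRepository:
    """Test double: returns whatever companies it was given."""

    def __init__(self, companies):
        self._companies = list(companies)

    def all(self):
        return iter(self._companies)


class CombinedCompanyRepository:
    """Builds one Company from two hypothetical sources keyed by company id."""

    def __init__(self, service_rows: dict, website_rows: dict):
        self._service_rows = service_rows    # id -> name (service database)
        self._website_rows = website_rows    # id -> phone queue (website database)

    def all(self):
        for company_id, name in self._service_rows.items():
            yield Company(company_id, name, self._website_rows.get(company_id, ""))


def companies_in_queue(repo: CompanyRepository, queue: str) -> list[str]:
    # Business logic only knows about the repository interface.
    return [c.name for c in repo.all() if c.phone_queue == queue]
```

The same companies_in_queue function works unchanged against the combined repository in production and the in-memory one in unit tests.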
So I'm not going to say, "Don't use the Entity Framework." I like it, and use it, myself. In view of recent news from Microsoft, I believe it is a better choice than LINQ to SQL. But it will not, by itself, solve the problem you describe. Use the Repository pattern.
If you want to use tools like Linq2Sql or EF and don't want to have to manage multiple DBMLs (or whatever the equivalent is called in EF or other tools), you could create views in your website database that reference back to the Clearview or phone system's DB.
This allows you to decouple your web site from their database structure. I believe Linq2Sql and EF can use a view as the source for an entity. If they can't, look at NHibernate.
This will also let you have composite entities that are pulled from the various data sources. There are some limitations on updating views in SQL Server; however, you can define your own INSTEAD OF trigger(s) on the view, which can then do the actual insert, update, and delete statements.
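The view plus INSTEAD OF trigger technique can be tried out in miniature. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, with two tables in one in-memory database playing the role of the two source databases; the table, view, and column names are invented, and SQL Server's trigger syntax differs from what is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two tables standing in for data owned by two different systems.
CREATE TABLE service_company (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE site_company (company_id INTEGER PRIMARY KEY, portal_enabled INTEGER);

-- Composite entity exposed to the application as a single view.
CREATE VIEW company AS
SELECT s.id, s.name, w.portal_enabled
FROM service_company AS s
JOIN site_company AS w ON w.company_id = s.id;

-- The view is not directly updatable, so route updates to the real tables.
CREATE TRIGGER company_update INSTEAD OF UPDATE ON company
BEGIN
    UPDATE service_company SET name = NEW.name WHERE id = OLD.id;
    UPDATE site_company SET portal_enabled = NEW.portal_enabled
    WHERE company_id = OLD.id;
END;

INSERT INTO service_company VALUES (1, 'Acme');
INSERT INTO site_company VALUES (1, 0);
""")

# The application updates the view; the trigger writes to both underlying tables.
conn.execute("UPDATE company SET name = 'Acme Ltd', portal_enabled = 1 WHERE id = 1")
row = conn.execute("SELECT name, portal_enabled FROM company WHERE id = 1").fetchone()
```

After the update, both underlying tables hold the new values even though the statement only named the view.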
L2S works perfectly with views in my project. You only need one small trick:
1. Add a secondary DB table to the current DB as a view.
2. In the Designer, add a primary key attribute to an id field on the view.
3. Only now, add an association to whatever other table you want in the original DB.
Now, you might see the view available for the navigation.