I am currently working on a project that involves managing eCRFs (electronic case report forms). As you may know, CRFs are documents that researchers use to collect data by asking a series of questions and recording the answers on paper.
The goal is to build a web interface that relieves the researchers of the paperwork.
My question concerns the design of the database. The eCRFs are not static. If they were, I would simply create one table in the database, with each column corresponding to a question in the CRF. What I want instead is a database that lets me define my own CRF with a variable number of fields each time, and even update an existing CRF by adding or removing a field.
How should I approach the database design?
Thank you.
Use a relational database only to store your metadata (e.g. who filled in the form), and store the rest as XML or JSON in BLOBs or flat files.
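A minimal sketch of that split, assuming Python with the standard-library sqlite3 module standing in for the real database. The table name `crf_submission`, its columns, and the helper functions are hypothetical, chosen only for illustration: the fixed metadata lives in ordinary columns, while the variable answers are serialized to JSON.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory database with one table for submitted forms."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE crf_submission (
            id           INTEGER PRIMARY KEY,
            form_name    TEXT NOT NULL,     -- which CRF was filled in
            form_version INTEGER NOT NULL,  -- the CRF layout can change over time
            filled_by    TEXT NOT NULL,     -- metadata: who filled in the form
            answers      TEXT NOT NULL      -- the variable part, stored as JSON
        )
        """
    )
    return conn


def save_submission(conn, form_name, form_version, filled_by, answers):
    """Store the metadata in columns and the answers as one JSON document."""
    cur = conn.execute(
        "INSERT INTO crf_submission (form_name, form_version, filled_by, answers) "
        "VALUES (?, ?, ?, ?)",
        (form_name, form_version, filled_by, json.dumps(answers)),
    )
    conn.commit()
    return cur.lastrowid


def load_answers(conn, submission_id):
    """Return the answers as a dict, or None if the submission does not exist."""
    row = conn.execute(
        "SELECT answers FROM crf_submission WHERE id = ?", (submission_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_store()
# Version 2 of the form gained a field; no schema change was needed.
v1 = save_submission(conn, "baseline_visit", 1, "dr_smith", {"age": 54, "smoker": False})
v2 = save_submission(conn, "baseline_visit", 2, "dr_smith",
                     {"age": 61, "smoker": True, "weight_kg": 82.5})
```

The trade-off is that individual answers are not directly queryable with plain SQL; only the metadata columns are.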
So this is more of a conceptual question. There might be some fundamental concepts which I don't understand clearly so please point out any mistakes in my understanding.
I am tasked with designing a framework and a part of it is I have a MySQL DB and a REST API which acts as the Data Access Layer. Now, the user should be able to parse various data (JSON, CSV, XML, Text, Source Code etc.) and send it to the REST API which persists the data to the DB.
Question 1: Should I specify that all data sent to the REST API should be in JSON format no matter what is parsed? This will ensure (best to my understanding) language independence and gives the REST API a common format to deal with.
Question 2: When it comes to a data model, what should I specify? Is it like a one-model-fits-all sort of thing or is the data model subject to change based on the incoming data?
Question 3: When I think of a relational data model, the thought of foreign keys comes to mind which creates the relation. Now, it might happen that some data may not contain any relation at all. If we think of something like Customer Order sort of data then the relation is easy to identify. But what if the data does not have any relation at all? How does the relational model fit into this?
Any help/suggestion is greatly appreciated. Thank you!
EDIT:
First off, the data can be both structured (say XML) and unstructured (say two text files). I want the DAL to be able to handle and persist whatever data that comes in (that's why I thought of a REST interface in front of the DB).
Secondly, I recently thought about MongoDB as an option and started looking into it (I have never used NoSQL DBs before). It kind of makes sense to use it if the incoming data in REST is in JSON. From what I understand, I can create a collection in Mongo. Does that make more sense than using a relational DB?
Finally, as for what I want to do with the data: I have a tool that performs a kind of difference analysis (think git diff) on it. Say I send two XML files; the tool retrieves them from the DB, performs the difference analysis, and stores the result back in the DB.
Based on these requirements, what would be the optimum way to go about it?
The answer to this will depend on what sort of data it is. Are all these different data types using different notations for the same data? If so, then storing it in normalised database tables is the way to go. If it's just arbitrary strings that happen to have some form of encoding, then it's probably best to store it raw.
Again, it depends on what you want to do with it afterwards. Are you analysing the data? Are you reporting on it? Are you reading one format and converting to another? Is it all some form of key-value pairs in some notation or other?
No way to answer this further without understanding what you are trying to achieve.
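For the "store it raw" case combined with the diff use described in the question, one possible shape is sketched below. It assumes Python with sqlite3 and difflib from the standard library; the `raw_document` table and the function names are hypothetical, and a line-based unified diff is only one of several ways to compare two payloads.

```python
import difflib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_document ("
    " id INTEGER PRIMARY KEY,"
    " fmt TEXT NOT NULL,"   # declared format, e.g. 'xml', 'csv', 'text'
    " body TEXT NOT NULL)"  # payload kept exactly as received
)


def store_raw(fmt, body):
    """Persist a payload untouched, tagged with its format."""
    cur = conn.execute(
        "INSERT INTO raw_document (fmt, body) VALUES (?, ?)", (fmt, body)
    )
    conn.commit()
    return cur.lastrowid


def load_body(doc_id):
    """Fetch a stored payload, raising KeyError for an unknown id."""
    row = conn.execute(
        "SELECT body FROM raw_document WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    return row[0]


def diff_documents(id_a, id_b):
    """Return a line-based unified diff between two stored payloads."""
    a = load_body(id_a).splitlines()
    b = load_body(id_b).splitlines()
    return list(
        difflib.unified_diff(
            a, b, fromfile=f"doc-{id_a}", tofile=f"doc-{id_b}", lineterm=""
        )
    )


old_id = store_raw("xml", "<order>\n  <qty>1</qty>\n</order>")
new_id = store_raw("xml", "<order>\n  <qty>2</qty>\n</order>")
delta = diff_documents(old_id, new_id)
# The result of the analysis is itself just another raw document.
result_id = store_raw("diff", "\n".join(delta))
```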
I want to build an application that uses data from several endpoints.
Let's say I have:
JSON API for getting cinema data
XML Export for getting data about ???
Another JSON API for something else
A csv-file for some more shit ...
In my application I want to bring all this data together and build views for it and so on ...
My idea was to set up a database and create schemas for all these data sources, so that I can write "import scripts" which I can call whenever I want to get the latest data.
I thought of schemas because I want to be able to easily adapt to a new API with any kind of schema.
Please enlighten me of the possibilities and best practices out there (theory and practice if possible :P)
You are totally right on making a database. But the real problem is probably not going to be how to store your data. It's going to be how to make it fit together logically and semantically.
I suggest you first take a good look at what your endpoints can provide. Get several samples from every source and analyze them if you can. How will you know which data is new? How can you match it against existing data and against data from other sources? If existing data changes or gets deleted, how will you detect and handle that? What if sources disagree on something? How and when should you run the synchronization? What will you do if one of your sources goes down? Etc.
It is extremely difficult to make data consistent if your data sources are not. As a rule, if the sources are different, they are not consistent. Thus the proverb "garbage in, garbage out". We, humans, have no problem dealing with small inconsistencies, but algorithms cannot work correctly if there are discrepancies. Even if everything fits together on paper, one usually forgets that data can change over time...
At least that's my experience in such cases.
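To make the "which data is new, changed, or deleted?" questions concrete, here is a minimal sketch in Python. It assumes every source record carries a stable key (called `id` here, which is an assumption about your sources) and that you keep a hash of each record from the previous import. The function names are hypothetical.

```python
import hashlib
import json


def fingerprint(record):
    """Hash a record in a canonical form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_sync(stored_hashes, incoming, key="id"):
    """Compare the previous import with the current one.

    stored_hashes maps each record key to the fingerprint saved last time;
    incoming is the list of records just fetched from the source.
    """
    seen = {record[key]: fingerprint(record) for record in incoming}
    return {
        "new": sorted(k for k in seen if k not in stored_hashes),
        "changed": sorted(
            k for k in seen if k in stored_hashes and stored_hashes[k] != seen[k]
        ),
        "deleted": sorted(k for k in stored_hashes if k not in seen),
    }


previous = {
    1: fingerprint({"id": 1, "title": "Alien"}),
    2: fingerprint({"id": 2, "title": "Brazil"}),
}
current = [
    {"id": 2, "title": "Brazil (restored)"},  # changed since the last import
    {"id": 3, "title": "Casablanca"},         # not seen before
]                                             # id 1 has disappeared
plan = plan_sync(previous, current)
```

This only answers the mechanical part. It does not resolve the harder cases above, such as two sources disagreeing, or a source being down and therefore reporting everything as deleted.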
I'm not sure if in the application you want to display all the data in the same view or if you are going to be creating different views for each of the sources. If you want to display the data in the same view, like a grid, I would recommend using inheritance or an interface depending on your data and needs. I would recommend setting this structure up in the database too using different tables for the different sources and having a parent table related to all them that has a type associated with it.
Here's a good thread with discussion about choosing an interface or inheritance.
Inheritance vs. interface in C#
And here are some examples of representing inheritance in a database.
How can you represent inheritance in a database?
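As a rough illustration of the parent-table-with-type layout described above, here is a sketch using Python and sqlite3. The table and column names (`item`, `cinema_item`, `csv_item`, `source_type`) are hypothetical; this is one common way to model inheritance in tables, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Parent table: the fields every source has in common, plus a type tag.
    CREATE TABLE item (
        id          INTEGER PRIMARY KEY,
        source_type TEXT NOT NULL CHECK (source_type IN ('cinema', 'csv_row')),
        title       TEXT NOT NULL
    );
    -- One child table per source, holding only the source-specific fields.
    CREATE TABLE cinema_item (
        item_id INTEGER PRIMARY KEY REFERENCES item(id),
        city    TEXT NOT NULL
    );
    CREATE TABLE csv_item (
        item_id  INTEGER PRIMARY KEY REFERENCES item(id),
        raw_line TEXT NOT NULL
    );
    """
)


def add_cinema(title, city):
    """Insert the shared part into the parent, then the cinema-specific part."""
    cur = conn.execute(
        "INSERT INTO item (source_type, title) VALUES ('cinema', ?)", (title,)
    )
    conn.execute(
        "INSERT INTO cinema_item (item_id, city) VALUES (?, ?)", (cur.lastrowid, city)
    )
    return cur.lastrowid


def add_csv_row(title, raw_line):
    """Insert the shared part into the parent, then the CSV-specific part."""
    cur = conn.execute(
        "INSERT INTO item (source_type, title) VALUES ('csv_row', ?)", (title,)
    )
    conn.execute(
        "INSERT INTO csv_item (item_id, raw_line) VALUES (?, ?)",
        (cur.lastrowid, raw_line),
    )
    return cur.lastrowid


def grid_rows():
    """A combined grid view only needs the parent table."""
    return conn.execute(
        "SELECT id, source_type, title FROM item ORDER BY id"
    ).fetchall()


cinema_id = add_cinema("Odeon", "Berlin")
csv_id = add_csv_row("row 1", "x;y;z")
```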
So I have this FileMaker Pro 7 database. As my senior project, I am supposed to migrate the database to a MySQL database and then give it a PHP-based interface in 3N form...
The company allows us $200 at most to spend on the project, but if I pay for something, it has to work. However, I am having trouble finding a way of migrating the database. Any suggestions?
I have found "file maker pro migrator" (http://www.fmpromigrator.com). Would the trial version be enough for us? In the worst case, we will start from scratch and throw away the whole database that the company has.
I can also download fileMakerPro12 and use it for a month with trial version for free. Would I be able to convert the db by using FMP12?
I am totally lost...open to any free suggestions...
P.S. This is a non-profit company I'm doing the project for.
If I had to do it, I'd look at the design of the FileMaker db and create something similar in mysql. Then I would export the Filemaker data to text and import it somehow. The details depend on foreign key values and such.
The PHP interface would be done separately.
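The export-then-import step could look roughly like the sketch below. It is written in Python with sqlite3 standing in for MySQL so that it runs without a server. The `donor` and `donation` tables and the CSV contents are invented for illustration; the real layout would come from the FileMaker design. The point it shows is the ordering that foreign keys force: parent rows are loaded before the rows that reference them.

```python
import csv
import io
import sqlite3


def import_csv(conn, table, columns, csv_text):
    """Load rows from exported CSV text (with a header line) into a table."""
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [tuple(row[col] for col in columns) for row in reader]
    conn.executemany(sql, rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the references
conn.executescript(
    """
    CREATE TABLE donor (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE donation (
        id       INTEGER PRIMARY KEY,
        donor_id INTEGER NOT NULL REFERENCES donor(id),
        amount   REAL NOT NULL
    );
    """
)

# Hypothetical text exports, one file per FileMaker table.
donors_csv = "id,name\n1,Alice\n2,Bob\n"
donations_csv = "id,donor_id,amount\n10,1,25.00\n11,1,40.00\n12,2,15.50\n"

# Parents first, then children, so every donor_id already exists.
donor_count = import_csv(conn, "donor", ["id", "name"], donors_csv)
donation_count = import_csv(
    conn, "donation", ["id", "donor_id", "amount"], donations_csv
)
```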
MySQL Data Conversion:
Yes, if your database is small enough, the demo version of FmPro Migrator will convert the database and also build you a PHP web application - at no cost.
Here are the limitations of the demo version:
5 fields
5 scripts
5 layouts
PHP Web Application:
Most people don't realize it, but there is a wealth of FileMaker metadata available in XML format for performing these types of conversions. This XML info is available either thru copying the layout via the clipboard or reading it from the Database Design Report XML file. I have found the clipboard data to be the most reliable source of this info.
FmPro Migrator is able to parse in the XML and convert it into the PHP web application.
Each object on a layout is represented in XML, along with style and position info. This info can be used to create form files representing the same look as the original layout. In fact, it can be difficult to see the difference between the web application and the original database if you get all of the object properties implemented. This can be helpful for situations in which companies don't want to have to retrain their employees. They want the web application to look and work the same as the original desktop application.
I have done a few of these conversions recently into the CakePHP framework. Here are a few techniques I used:
Auto-Enter Calculation Fields - Stored calculation fields are calculated and stored within the model when it saves a record to the database.
Unstored Calculation Fields - Unstored Calculation fields are calculated in real-time within the form controller - but only for fields actually displayed on the form. This prevents unnecessarily calculating these values if they aren't being used on a form, improving performance.
Global Fields - A Global field in FileMaker is used like a global variable in programming languages. It is important to know that each FileMaker user gets their own private copy of global field data. There is no equivalent feature in MySQL or other SQL database servers, but this functionality can easily be simulated using SESSION variables. Therefore each web user will still get their own private SESSION data, simulating the same functionality originally present in the FileMaker database. I structure these globals in the model data array as if they were retrieved from the model, meaning that converted scripts and fields on forms can reference them easily. Just before the record gets written into the database, the results are saved into SESSION variables for persistence.
Global Variables in Scripts - Global variables within FileMaker scripts match up very well with the use of PHP SESSION variables, if you want to implement the same functionality.
Vector Graphic Objects - FileMaker layouts frequently include rectangles, ovals and line objects. These objects can be replaced with the RaphaelJS library, providing high-quality, resolution-independent graphics.
Value Lists - Custom and Field based value lists are implemented in a centralized location within the AppController.php file. Therefore making a change to the definition of the value list within the AppController, succeeds in changing the menu automatically throughout the whole application.
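Two of the techniques above, unstored calculations evaluated only for displayed fields and globals kept per user session, can be sketched in a language-neutral way. The following is a Python illustration, not the CakePHP code from the conversions described. The names `UNSTORED_CALCS` and `build_form_data`, and the `g_` prefix for global fields, are assumptions made for the example.

```python
# Unstored calculations: name -> function of the record, evaluated on demand.
UNSTORED_CALCS = {
    "full_name": lambda rec: f"{rec['first_name']} {rec['last_name']}",
    "total": lambda rec: rec["quantity"] * rec["unit_price"],
}


def build_form_data(record, displayed_fields, session):
    """Assemble the values a form needs, computing nothing it does not show.

    record is the row loaded from the database, displayed_fields lists the
    field names on the form, and session is the current user's private
    storage (a plain dict here, standing in for web session variables).
    """
    data = {}
    for field in displayed_fields:
        if field in UNSTORED_CALCS:
            # Calculated in real time, and only because the form displays it.
            data[field] = UNSTORED_CALCS[field](record)
        elif field.startswith("g_"):
            # Global field: each user sees the value from their own session.
            data[field] = session.get(field)
        else:
            data[field] = record[field]
    return data


record = {"first_name": "Ada", "last_name": "Lovelace"}
alice_session = {"g_current_filter": "open"}
bob_session = {"g_current_filter": "closed"}

# "total" is not displayed, so it is never evaluated, even though this
# record lacks the quantity and unit_price it would need.
alice_view = build_form_data(record, ["full_name", "g_current_filter"], alice_session)
bob_view = build_form_data(record, ["full_name", "g_current_filter"], bob_session)
```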
Folks,
I have been tasked with refactoring a large system that comprises many JavaScript-based worktools/workflows into something more manageable.
The first part of this is to refactor the data. All data is currently stored as JSON strings in a SQL database. So for each worktool which could be anything from a set of forms to an interactive chart all data is stored as one JSON string in a record associated with the worktool.
I have been told that, because of the complexity of the object graph behind some of these worktools, it would be fairly futile to look for commonalities that would let me model a relational schema from the data: the schema would either have to be some really generic key-value structure, or we would need hundreds of tables.
I'm wondering now whether there is value in using a non-relational persistence mechanism, such as NoSQL. I am only looking into this now, but I would really appreciate opinions from anyone who has experience with a similar context or with NoSQL products and processes.
Thank you
You can just move your data as is to RavenDB.
It natively understands JSON, and you can start querying and working with those documents as first-class members.
RavenDB also has a great set of client libraries, which make working with it a breeze.
First, some background. Our application is built on ASP.NET MVC3, .NET 4.0, and uses Linq-to-Sql (PLINQO) as its primary means of data access. Our web application is a multi-tenant/multi-client system where each client gets their own Sql Server database. Each Sql Server database up to now has had exactly the same schema.
Often, clients will ask us to track custom fields in their Db that other clients don't track. The way we've handled this is by reserving a number of custom fields in the db in our main tables. For example, our Widget table may have CustomText1, CustomText2 ... CustomText10 and CustomDate1, CustomDate2 ... CustomDate10 fields. Again, all our schemas across clients are the same, so Linq-to-Sql handles these fields just as easily as any other field.
Now we are running into an issue where a client wants several hundred CustomBool fields, but doesn't need the others. So, basically, we are researching for ways to still use the Linq-to-Sql, but have it work against potentially different schemas depending on the database it is connected to (although they are different in a very specific way.)
Too much code has already been built on Linq-to-Sql and accessing the Widget classes generated by it that I'd like to not just fall back to straight SQL.
I've seen answers here and on the web on ways for Linq to Sql to access different tables that have the same schema, but I have not found a good answer to the same table name across different dbs with different columns.
Is this possible?
If the main objective is to store a few extra fields for existing domain objects then why not create a generic table that can store key value pairs. This is extremely flexible since there is no need to change your schema if a customer requires a new property.
We do this frequently and normally have some helpers to correctly cast the properties e.g.
Service.GetProperty<bool>("SomeCustomProperty")
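To show the idea behind such a helper outside of .NET, here is a minimal Python sketch using sqlite3. The `custom_property` table, the `set_property`/`get_property` functions, and the cast table are hypothetical and are not the implementation behind the `Service.GetProperty` call above. Values are stored as text and cast back on read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE custom_property (
        entity_id INTEGER NOT NULL,  -- the domain object the property belongs to
        name      TEXT NOT NULL,
        value     TEXT NOT NULL,     -- every value is stored as text
        PRIMARY KEY (entity_id, name)
    )
    """
)


def _to_bool(text):
    """Interpret common textual spellings of true."""
    return text.strip().lower() in ("1", "true", "yes")


# How to turn the stored text back into each supported type.
CASTS = {bool: _to_bool, int: int, float: float, str: str}


def set_property(entity_id, name, value):
    """Add or overwrite a custom property without touching the schema."""
    conn.execute(
        "INSERT OR REPLACE INTO custom_property (entity_id, name, value) "
        "VALUES (?, ?, ?)",
        (entity_id, name, str(value)),
    )
    conn.commit()


def get_property(entity_id, name, as_type, default=None):
    """Read a custom property and cast it to the requested type."""
    row = conn.execute(
        "SELECT value FROM custom_property WHERE entity_id = ? AND name = ?",
        (entity_id, name),
    ).fetchone()
    if row is None:
        return default
    return CASTS[as_type](row[0])


set_property(42, "SomeCustomProperty", True)
set_property(42, "MaxItems", 7)
flag = get_property(42, "SomeCustomProperty", bool)
```

The cost of this flexibility is that filtering or sorting by a custom property in SQL becomes awkward, since every value is text in a single column.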
If you are looking for a more "pluggable" domain model that can be completely different for each tenant, I think you will struggle if you are following a database driven approach and using the L2S designer to generate your code.
To achieve this you really need to be generating your database based on your code (domain driven design) which will give you much more flexibility i.e. you can load a tenant specific configuration (set of classes, business rules etc.) at runtime and use this to generate/validate your schema.
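As a rough sketch of what "generate and validate the schema from code" could mean, the following Python example derives a CREATE TABLE statement from a per-tenant class. It uses dataclasses and sqlite3 from the standard library. The classes, the `TENANT_MODELS` mapping, and the type table are all hypothetical, and a real system would more likely rely on an ORM's code-first tooling than on hand-built SQL.

```python
import sqlite3
from dataclasses import dataclass, fields

# Mapping from Python field types to SQL column types (assumed, minimal).
SQL_TYPES = {int: "INTEGER", str: "TEXT", bool: "INTEGER", float: "REAL"}


@dataclass
class Widget:
    """The domain model every tenant shares."""
    id: int
    name: str


@dataclass
class AcmeWidget(Widget):
    """A tenant-specific model that adds its own custom fields."""
    is_fragile: bool = False
    is_recalled: bool = False


# Tenant configuration loaded at runtime decides which model applies.
TENANT_MODELS = {"default": Widget, "acme": AcmeWidget}


def create_table_sql(model, table_name):
    """Derive the table definition from the model's fields."""
    columns = []
    for f in fields(model):
        # f.type is the annotated class here because the annotations
        # in this module are not written as strings.
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == "id":
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {table_name} ({', '.join(columns)})"


def schema_matches(conn, model, table_name):
    """Check that the table's columns are exactly the model's fields."""
    actual = [row[1] for row in conn.execute(f"PRAGMA table_info({table_name})")]
    expected = [f.name for f in fields(model)]
    return actual == expected


acme_db = sqlite3.connect(":memory:")
acme_db.execute(create_table_sql(TENANT_MODELS["acme"], "widget"))
```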
Update
It would be good if you could elaborate on exactly what design approach you have taken i.e. are you using the Linq designer and generating your model from the database?
It's clear that a generic key value pair store is not going to meet your querying requirements.
It's hard to provide a solution without suggesting a different technology. Relational SQL databases aren't really suited for dynamic domain models. You may be better off with a document database such as MongoDb or RavenDb where you are not tied to a specific schema. You could even make use of these just for your custom properties.
If that's not ideal, then another solution would be to use something like Dapper to construct your queries. Assuming you are developing against interfaces, you can have an implementation of your data service per tenant that makes use of their custom fields.
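The per-tenant data service idea can be illustrated independently of Dapper. The sketch below is Python with sqlite3 rather than .NET, and every class and function name in it is hypothetical. Callers depend only on the abstract service, while each tenant's implementation knows which columns its own database has.

```python
import sqlite3
from abc import ABC, abstractmethod


class WidgetService(ABC):
    """The interface the rest of the application is written against."""

    def __init__(self, conn):
        self.conn = conn

    @abstractmethod
    def columns(self):
        """Return the widget columns that exist in this tenant's database."""

    def find_all(self):
        """Build the query from the tenant's column list and run it."""
        cols = self.columns()
        sql = f"SELECT {', '.join(cols)} FROM widget ORDER BY id"
        return [dict(zip(cols, row)) for row in self.conn.execute(sql)]


class DefaultWidgetService(WidgetService):
    def columns(self):
        return ["id", "name"]


class AcmeWidgetService(WidgetService):
    """This tenant has an extra custom column."""

    def columns(self):
        return ["id", "name", "is_fragile"]


SERVICES = {"acme": AcmeWidgetService}


def service_for(tenant, conn):
    """Pick the tenant's implementation, falling back to the default one."""
    return SERVICES.get(tenant, DefaultWidgetService)(conn)


# Two tenant databases whose widget tables differ by one column.
default_db = sqlite3.connect(":memory:")
default_db.execute("CREATE TABLE widget (id INTEGER PRIMARY KEY, name TEXT)")
default_db.execute("INSERT INTO widget VALUES (1, 'bolt')")

acme_db = sqlite3.connect(":memory:")
acme_db.execute(
    "CREATE TABLE widget (id INTEGER PRIMARY KEY, name TEXT, is_fragile INTEGER)"
)
acme_db.execute("INSERT INTO widget VALUES (1, 'vase', 1)")

acme_widgets = service_for("acme", acme_db).find_all()
other_widgets = service_for("some_other_tenant", default_db).find_all()
```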
Ayende did a whole series of posts on Multitenancy and covers tenant specific domain models. It starts here and may be of some use to you.