We have about about six systems (they are all internal systems) that we need to send data between. Currently we do not have a consistent way of doing this. We use SSIS, SQL Server linked servers to directly update databases, ODBC connections to directly update databases, text files, etc..
Our goals are:
1) Have a consistent way of connecting applications.
2) Have a central way of monitoring and logging the connections between
applications.
3) For the applications that offer web services we
would like to start using them instead of connectiong directly with
the database.
Whatever we use will need to be able to connect to web services, databases, flat files, and should also be able to accept data via a tcp connection.
Is Biztalk a good solution for this, or is it is overkill?
It really depends. For the architecture you're describing, it would seem a good fit. However, you will need to validate wether biztalk can communicate whith the systems you are trying to integrate. For example; when these systems use webservices, message queues or file based communication, that may be a good fit.
When you start with biztalk, you have to be willing to invest in hardware, software, en most of all in learning to use it.
regarding your points:
1) yes, if you make sure to encapsulate the system connectors correctly
2) yes, biztalk supports this with BAM
3) yes, that would match perfectly
From what you've described (6 systems), it is definitely a good time to investigate a more formalized approach to integration, as you've no doubt found that in a point to point / direct integration approach will result in a large number of permutations / spaghetti as each new system is added.
BizTalk supports both hub and spoke, and bus type topologies (with the ESB toolkit), either of which will reduce the number of interconnects between your systems.
To add to oɔɯǝɹ:
Yes - ultimately BizTalk converts everything to XML internally and you will use either visual maps or xslt to transform between message types.
Yes. Out of the box there are a lot of WMI and Perfmon counters you can use, plus BizTalk has a SCOM management pack to monitor BizTalk's Health. For you apps, BAM (either TPE for simple monitoring, but more advanced stuff can be done with the BAM API).
Yes - BizTalk supports all the common WCF binding types, and basic SOAP web services. BizTalk's messagebox can be used as a pub / sub engine which can allow you to 'hook' other processes into messages at a later stage.
Some caveats:
. BizTalk should be used for messages (e.g. Electronic Documents across the organisation), but not for bulk data synchronisation. SSIS is a better bet for really large data transfers / data migration / data synchronisation patterns.
. As David points out, there is a steep learning curve to BizTalk and the tool itself isn't free (requiring SQL and BizTalk licenses, and usually you will want to use a monitoring tool like SCOM as well.). To fast track this, you would need to send devs on BizTalk training, or bring in a BizTalk consultant.
. Microsoft seem to be focusing on Azure Service Bus, and there is speculation that BizTalk is going merged into Azure Service Bus at some point in future. If your enterprise strategy isn't entirely Microsoft, you might also want to consider products like NServiceBus and FUSE for an ESB.
You problem is a typical enterprise problem. Companies start of building isolated applications like HR, Web, Supply Chain, Inventory, Client management etc over number of years and once they reach a point these application cannot be living alone and they need to talk to each other, typically they start some hacked solution like data migration at database level.
But very soon they realize the problems like no clear visibility, poor management, no standards etc and they create a real spaghetti. The biggest threat is applications will become dependant on one another and you lose your agility to change anything. Any change to system will require heavy testing and long release cycle.
This is the kind of problem a middleware platform like BizTalk Server will solve for you. Lot of replies in the thread focused on cost of BizTalk server (some of the cost mentioned are not correct BTW). It's not a cheap product, but if you look at the role it play in your organisation as a central middleware platform connecting all the applications together and number of non-functional benefits you get out of the box like adapters to most of the third party products like SAP, Oracle, FTP, FILE, Web Services, etc, ability to scale your platform easily, performance, long running work flows, durability, compensation logic for long running workflows, throttling your environment etc., soon the cost factor will diminish.
My recommendation will be take a look at BizTalk, if you are new then engage with local Microsoft office. Either they can help or recommend a parter who can come and analyse your situation.
Related
I'm looking for a way to implement basic Publish / Subscribe between applications written in different languages, to exchange events with JSON payloads.
WebSocket seems like the obvious choice for the transport, but you need an (arguably small) layer on top to implement some of the plumbing:
aggreeing on messages representing the pubsub domain "subscribe to a topic", "publish a message"
aggreeing on messages for the infra ("heartbeat", "authentication")
I was expecting to find an obvious standard for this, but there does not seem to be any.
WAMP is often refered to, but in my (short) experience, the implementations of server / clients libraries are not great
STOMP is often refered to, but in my (even shorter) experience, it's even worse
Phoenix Channels are nice, but they're restricted to Phoenix/Elixir world, and not standard (so the messages can be changed at any phoenix version without notice.)
So, is everyone using MQTT/WS (which require another broker components, rather than simple servers ?) Or gRPC ?
Is everyone just re-implementing it from scratch ? (It's one of those things that seems easy enough to do oneselves, but I guess you just end up with an half-baked, poorly-specified, broken version of the thing I'm looking for...)
Or is there something fundamentally broken with the idea of serving streams of data from a server over WS ?
There are two primary classes of WebSocket libraries; those that implement the protocol and leave the rest to the developer, and those that build on top of the protocol with various additional features commonly required by realtime messaging applications, such as restoring lost connections, pub/sub, and channels, authentication, authorization, etc.
The latter variety often requires that their own libraries be used on the client-side, rather than just using the raw WebSocket API provided by the browser. As such, it becomes crucial to make sure you’re happy with how they work and what they’re offering. You may find yourself locked into your chosen solution’s way of doing things once it has been integrated into your architecture, and any issues with reliability, performance, and extensibility may come back to bite you.
ws, faye-websockets, socket.io, μWebSockets and SocketCluster are some good open-source options.
The number of concurrent connections a server can handle is rarely the bottleneck when it comes to server load. Most decent WebSocket servers can support thousands of concurrent connections, but what’s the workload required to process and respond to messages once the WebSocket server process has handled receipt of the actual data?
Typically there will be all kinds of potential concerns, such as reading and writing to and from a database, integration with a game server, allocation and management of resources for each client, and so forth.
As soon as one machine is unable to cope with the workload, you’ll need to start adding additional servers, which means now you’ll need to start thinking about load-balancing, synchronization of messages among clients connected to different servers, generalized access to client state irrespective of connection lifespan or the specific server that the client is connected to – the list goes on and on.
There’s a lot involved when implementing support for the WebSocket protocol, not just in terms of client and server implementation details, but also with respect to support for other transports to ensure robust support for different client environments, as well as broader concerns, such as authentication and authorization, guaranteed message delivery, reliable message ordering, historical message retention, and so forth. A data stream network such as Ably Realtime would be a good option to use in such cases if you'd rather avoid re-inventing the wheel.
There's a nice piece on WebSockets, Pub/Sub, and all issues related to scaling that I'd recommend reading.
Full disclosure: I'm a Developer Advocate for Ably but I hope this genuinely answers your question.
In my company we develop internal products and usually this products must be integrated (typical scenario). Our solution is always use SOA web services for this kind of task.
This is a solution that forces different products to develop same UI interfaces, and the responsibility of this interfaces or process is sometimes not clear between our teams.
I want to propose an alternative solution, I think that sometimes each product publish SOA web services but in addition publish web-pages, that other products can call to reuse.
for propose it in the work, I want document my-self because I suppose that this kind of integration is already invented, and have a name, examples, tools, best-practices...
any orientation?
To me it looks like a mistake to share entire UIs across applications. This will violate the separation of concerns principle, and will couple applications to each other. Also, you will end up losing autonomy because an external application will control how your UI looks. You will have all kinds of issues, related to the inter-dependencies between applications. These are the first that come to mind:
I assume the information is not public. If so, you'll need single sign-on between applications. In order to implement some level of security (at the very least authentication) you'll need to integrate all your applications within a single sign on-solution or invent some kind of convention for passing tokens between applications.
It will be very hard to implement even primitive authorization if needed, because naturally roles in different systems vary, especially across business domains.
If you have to filter any data by some property, it will have to be transferred in the request to the central application that holds the GUI prior to displaying. The whole processing of data and preparation for visualization will then be responsibility of this central UI system. This will lead to insane amount of caller-specific rules for each caller that uses the central UI.
Any complex UI required by applications will still have to be implemented locally, so you only solve the issue partially.
Here is what I would do in your case if I really must implement central UI, although I advice strongly against it and would prefer sticking to SOA with some kind of middle ware (ESB or even more primitive EAI software that limits point-to-point integration). You can decide for yourself if it is worth the time and effort:
I would only offer a very limited version of UI that is 100% based on the service layer that already exists and is generated automatically. This means - if you change the service the UI accommodates the new data and displays it automatically. If an application needs complex visualization it must still use the Service layer to retrieve data and implement custom UI locally. In genera, it will be better if UI and visualization are controlled by each application separately. This is much more agile and future-proof.
Hope this helps!
What problem do MOM (Message Oriented Middleware) solve? Scalability? Integration?
In which domain are they typically used and in which domains are they typically not used?
For example, say, is Google using such solution for it's main search engine or to power GMail?
What about big websites like Walmart, eBay, FedEx (pretty much a Java shop) and buy.com (pretty much an MS shop)? Does MOM solve a need there?
Does it make any sense when you're writing a Webapp where you control the server-side and have an homogenous environment (say tens of Amazon EC2 instances all running Linux + Java JVMs) there and where the clients are, well, Web browsers?
Does it make sense for desktop apps that need to communicate with a server?
Or is it 'only' for big enterprise stuff where you typically have a happy mix of countless of different systems that needs to communicate in a way or another?
I'm a bit confused as to what they're useful for and I think that with example of where they're appropriate and where they're not appropriate I could better understand their use.
This is a great question.
The main uses of messaging are: scaling, offloading work, integration, monitoring, event handling, routing, networking, push, mobility, buffering, queueing, task sharing, alerts, management, logging, batch, data delivery, pubsub, multicast, audit, scheduling, ... and more. Basically: anything where you need data but don't want to make a database request. (Caching is another, longer story).
Another way of looking at this is to notice that many applications used to be built by assuming that users (people) would perform actions that would be fulfilled by executing a transaction on a database (including reads, writes). But today, many actions are not user-initiated. Instead they are application-initiated. For example "tell me when the book that I want to buy is in stock". The best way to solve this class of problems is with messaging of some sort. Whether you call it middleware or web push or real time salad dressing does not matter. It's all messaging.
When you enable applications to initiate or react to events, then it is much easier to scale because your architecture can be based on loosely coupled components. It is also much easier to integrate those components if your messaging is based on a stable, scalable, serviceable tool, preferably using open standard APIs and protocols.
I hope this helps. We try to maintain a list of useful links about messaging here
Please get in touch with questions and comments on any of this, we are dead easy to find.
To address your specific questions:
In which domain are they typically used and in which domains are they typically not used?
Like databases, messaging systems crop up everywhere.
For example, say, is Google using such solution for it's main search engine or to power GMail?
Google uses a lot of home grown technology, but a lot of their open source contributions and known use cases suggest that messaging is (or should be) central to some of the main services.
What about big websites like Walmart, eBay, FedEx (pretty much a Java shop) and buy.com (pretty much an MS shop)? Does MOM solve a need there?
Very much so.
An example use case is scaling web page requests. When the user makes a web request, the web server puts it onto a queue for background processing. This means that the web server can keep working while the request is processed. It also means that the web server does not need to know how the request is handled, making system maintenance, upgrade and rollback much simpler because the main parts are 'decoupled'.
So, anyway, the web request gets processed by a back end service, or possibly by many services, eg 'look up book titles', 'draw shopping cart', 'get advertisement', 'check user account'... Finally all the results get put onto another queue, ready for collection and user response by the web server. Typically the system will include a timeout of around 100ms so that any late requests just get thrown away. The user sees anything that got processed in the time interval. This is one reason why some large ecommerce sites have pages that appear to load in stages.
There are many more use cases...
Does it make any sense when you're writing a Webapp where you control the server-side and have an homogenous environment (say tens of Amazon EC2 instances all running Linux + Java JVMs) there and where the clients are, well, Web browsers?
Definitely. If you have an unknown, or unbounded, number of users, server side instances, and application latencies, then it makes sense to use messaging, even if just as a scalable substrate for non-blocking RPC.
Does it make sense for desktop apps that need to communicate with a server?
In lots of cases. One very common case is when the server pushes events to the desktop app, eg game event, tweets, price feeds in finance, system alerts....
Or is it 'only' for big enterprise stuff where you typically have a happy mix of countless of different systems that needs to communicate in a way or another?
Definitely not only for those 'legacy integration' cases but they are important too. At RabbitMQ, the biggest customers we have in terms of pure scale or message volume are cloud providers and big web application providers.
I will answer only one answer, from prior experience - take a look at this middle-ware that is employed by big companies here - middle-ware has one purpose - to glue dis-connected systems (written in disparate languages) together so that they can interact with one another and streamline the business process - Entera as I have had experience with, creates a middle layer in which the unix box using processes written in C, interact with the mainframe system (DB2, COBOL) via a front-end written in PowerBuilder (I am not naming the company!).
From the description I have given, Entera is a middle-ware which hosts a number of things - smooth integration of the flow of data regardless of the endian format, ability for different languages to talk to the middle-ware broker (a broker is a CORBA or DCE like process, that conforms to 'The Open Group) that listens on a particular port) and is specified by an IDL which makes a process appear to be local - if you understand the terminology used in Remoting under Microsoft's .NET Framework, you are not far off the mark! The middle-ware generates stubs which are linked at compile-time and manages the creation of the process, hosting it off a port, multi-threading at run-time, and also, the modern front-ends (such as .NET, Java, PowerBuilder even the unspeakable VB6...ok...VB.NET for the purists out there) can interact by opening a connection to the specified port on a particular IP address, and using the stubs generated, can interact with it directly.
Obviously, from what was described you can see how the legacy systems can have new life breathed into it and thus scalability of the process, the major downside of this is the cost factor which can run into thousdands of dollars. Big companies who uses mainframes as their back-end processing systems for billing/invoicing, who generate a huge revenue can obviously afford such an expensive product - to them it would seem like throwing pennies into a pool of water...because of the use of middle-ware which prolongs the business process, and breathe new life into it, can extend the business by a good number of years into the future without worrying about 'legacy' tag attached to it.
Incidentally, I carried this out as part of my thesis for my BSc. in Information Systems which covered this commercial front-end. There was an open source version of the middle-ware available on sourceforge called FreeDCE, but development efforts have declined or stopped.
Edit:
#cocotwo: That is exactly what middle-ware does as you said it is a plumbing tool...message oriented middle-ware is not really heard of AFAIK because I would imagine, the processes (functions) would need to be called as if they are locally visible within the application domain of the front-end to make it easy to interact with.
Using messages may have its advantages over RPC calls in that the messages are queued in a safe-holding area in the event that a network disconnection occurs - there may be some data caching going on within that aspect to allow the front-end to continue regardless...it would be useful in the instances of 'updating a status of a particular billing/invoice number' - a one-way write-data to the back-end via the middle-ware.
Ok, big companies would have advanced systems infrastructure in that technicians are constantly around the clock to ensure a smooth delivery of data-flow so that would have to be factored in. The company that I worked with had IBM Global Support contract to fulfill in order to ensure a maximum uptime 99% with 6 nine's after the decimal point...with hot-swapping/balanced-clusters/mirroring systems in place...
Whereas with RPC, if the disconnection occurs, the front-end would have to be restarted or would have to handle the disconnection event. It really depends if the message-queueing middle-ware handles each message in real-time and pass back results to the front-end immediately...
This is where each (Message-queueing and RPC related middle-ware) have their strengths and weaknesses...and also the cost mitigation factor such as support, maximum up-time, development efforts and training - that's a biggie here as middle-ware are really proprietary (despite following the 'The Open Group' layout/standards) and complex to setup and to glue the whole thing together via scripts.
Good answers and discussion here. Our consulting team has two preferred "messaging" solutions: RabittMQ and NXTera a high speed RPC middleware, the contemporary version of Entera mentioned above. My partners and I have developed several solutions using RabittMQ, it is the best tool available in that space right now. Additionally, I happen to work for the company that makes NXTera/Entera.
From experience I can clearly say that both of these products meet the need for reliability and low maintenance as discussed above. There are situations where a messaging service, like RabittMQ, is the right choice -- where Publish and subscribe, certified delivery, Queuing or store-and-forward are required.
In other cases, RPC's (remote procedure calls) are the best and fastest solutions for transactional and distributed processing for enterprise or cloud-based applications. When it is right to use an RPC, but SOAP/.NET (yes these are RPC implementations) are too slow, expensive or complex, a lightwieght high speed RPC middleware like NXTera/Entera is the right choice for us.
There is some use case overlap between RPC middleware and message oriented middleware, and where there are you can use either successfully. But both are strong and dependable choices.
The large companies I work with use both RPC and MoM side-by-side. As far as Internet companies, Google (Protocol Buffers) and Facebook (Thrift) show that RPC's have a roll to play in modern web and cloud-based development.
I've been interested in 4D SAS' database product for a long time, though have barely touched it in eons.
In considering what tools to use for application development, especially one that will require a database component, what should be looked for when considering open-source tools like MySQL and PostgreSQL vs proprietary solutions like 4D or Pervasive SQL?
What good (and bad!) experiences has the SO community had with various DB tools like 4D, Pervasive, FilemakerPro, etc?
Any bad experiences?
Difficult to make a relevant list of Pros and Cons without a context.
My advice would be the following: when making the decision of using a proprietary database, make sure that this decision is based on strong facts and not merely a technical interest for an exotic tool. Put into the balance the benefits for using the proprietary database and the advantages of a non-proprietary solution.
The answer is different from system to system.
A prerequisite is that your system is well identified, with a clear scope, a quite predictable evolution, so that the results of your analysis will be robust. Then, if your proprietary solution brings a real benefit for your system, that you are comfortable with the support and that you can afford the overall cost, you should be a good candidate for the proprietary solution.
4D is a MacOS/Windows only cross-platform, proprietary database system with both stand-alone and Client-Server varieties. You would do well to compare it to Alphafive.com software which is Windows only. I've worked with it for 17 years and it has served me and my department very well. Off the top of my head ...
Pros:
Interface & code are closely tied to the data engine which makes development of rich, cross-platform user interfaces very fast and easy.
Proprietary relational data engine runs natively on both platforms, along with native client interfaces (but requires licenses for multi-users). Auto-relations are helpful (but sometimes get in the way).
Can access external systems via SOAP and ODBC and SQL drivers (limited).
Can access 4D from external systems via SOAP or http requests & web pages.
Native procedural programming language based on Pascal and is EASY to learn.
Excellent tool for small to mid-sized departments.
Latest version accepts subset of SQL commands AND original data access, so it's backward compatibility record has been very good.
Security is EASY in 4D.
You can build solutions to deploy through a variety of means, and are not limited by whether or not MS Access is installed.
Cons:
Interface & code are closely tied to the data engine which can lead to limited use of abstraction and "black-box" coding unless you make it a goal of your development.
Compiles to one monolithic structure file forcing restart for single fixes.
Language is still only procedural--making it harder for object-oriented programmers to accept. Every method requires separate "file" in 4D so you can't include more then one function or procedure in a single routine -- it will take some getting used to it.
While company appears to be in good shape, growing and developing, you simply never know as they keep their condition to themselves.
Company has never really marketed itself--trusting in its developer base to spread the word and grow the product through site deployments and product upgrades. Web site is clearly useful only to developers who already use the product -- it simply fails to attract new users.
Product upgrades have always seemed to focus on how the tool is better for the DEVELOPERS rather than for the CUSTOMERS of those developers.
SQL lacks views, compound indexes, and other common SQL features.
When a user requests a report of specific columns of data, I often have to write yet another program just to provide that specific data -- I can't always just query the data and generate a text file.
Does not handle new OS versions with nearly the ease of web browser based applications. Older version is broken on Mac OS 10.6, and newest version requires the latest Mac OS 10.6. No version is certified yet on Windows 7.
I've been nearly a year at learning ASP.NET and a few weeks at Ruby on Rails. While SQL data stores are EASY, user interface is HARD -- but worth it when your application still functions through OS upgrades. You can always use an older browser if the latest version breaks something.
I'd recommend you consider either of those, depending on how much funds you have available to implement the project--Rails being the cheaper of the two. Then, ANY system with a web browser can access the data, and you can fix interface pages on the fly as needed rather than taking the whole system down a few minutes for a single, simple update. Those skills might be more marketable in the future.
I will only say one thing.. Watch the "actual" cost of your decision.. Most proprietary database systems are Windows only.. or sometimes Mac/Windows only.
This means that along with paying quite a bit of money for the database system, you must also pay a good amount of money on a Server operating system to run it...
Also, compare the database system with current open source solutions. Is it really worth it? After moving from Microsoft Sql Server(which has a free edition, but anyway) to PostgreSQL I was blew away that people pay so much for SQL Server.. I mean, Postgres to me is a lot more clean, and most of it works exactly how you'd expect(unlike in certain SQL server syntaxes) and it has more features built into it(programming stored procs in Ruby anyone?)
So basically, compared the proprietary with the open source software and decide upon which one to take by total pricing(including OS) and feature set..
Pro of zeroing in on any DB: it's got good non-portable features that help you get things done
Con of zeroing in on any DB: sometimes a different DB is appropriate (for example running your tests with in-memory SQLite instances), but that option is now closed
Con of a proprietary commercial DB: if you need many instances, licensing costs can kill you
Consider the following questions:
How easy (or difficult) is it to make changes in maintenance? Applications are likely to spend far more time in maintenance than they do in development, so if changes are hard, long-term pain is guaranteed.
What is the quality of support? A system that is well-documented, proprietary or otherwise, is going to be easier to work with.
How large (or small) is the user community? Systems with larger user communities mean more people to ask for assistance if and when things go wrong.
How robust are the import/export capabilities of this proprietary database system?
I found the last point particularly useful at my first full-time job. Our client was using CA-Ingres, and no one at the company knew it well enough to write queries to validate the data. So I came up with the idea of exporting the data from Ingres and importing it into MS SQL Server (which I knew from a brief stint at Sybase Professional Services) so we could write our validation queries there. If it had been really hard to export data from Ingres, my idea wouldn't have been an option at all.
From 4D's webpage, I gather that we are looking at a complete development+deployment environment, not a standalone database as such. So the alternatives you could be looking at include stuff like django, ruby-on-rails, hibernate and others. The real question, of course, is if the proprietary system can save you enough money doing the product lifetime to justify the costs of the product. And that would depend on the type of human resources you have available.
4D is a good option for vertial applications. I have worked for a company which used 4D to build a medical records and billing application for general practitioners and specialists. The rapid design and deployment features of 4D enabled the application to quickly move with market desires and legislated changes to medical record storage.The environment itself was not cutting edge, but it was integrated, cross platform and very productive.
If you are entering a market with high vendor lock-in and a high barrier to entry, then I think proprietory integrated development environments are a good option.
At various points in my career, I've used and gotten very good at FileMaker Pro, FoxPro, 4D, and a few other commercial products. Now I mainly use PHP/MySQL, and haven't used the latest versions of any of the products.
I've always liked FileMaker because most people who can use a computer can pick up FileMaker and design their own systems. They don't have to know programming or database design. But, you can "program" FileMaker, put a web front end on it, or do other more sophisticated setups if you need to. Many times I was "handed" a system created in FileMaker by a non-technical person that needed to be made into a full fledged data management system. The good part was that all the "specs" and data flow were already designed into a system. The prototype was already created!
4D and FoxPro I always found required a certain amount of extra programming and/or database knowledge to really do anything with. 4D & FileMaker are really complete self-contained systems, not just database systems. Although they all have the ability to hook into other backend databases systems (i.e. MySQL, Oracle), that is not their strong point.
On the downside, doing more complex, dynamic systems can be difficult in 4D and Filemaker due to everything being tightly coupled. Because of their cost, you really would want to create multiple systems with them. Which means you need to really "buy into them" to get your money's worth.
The key concept is always adherence to standards: if you plan to use 4D's custom and / or special designed functions (but the discussion could be far more general, and cover any other free or commercial tool in the wild), well, just use it and take your advantage.
Not surprisingly, that's why huge DB systems like Oracle or IBM's DB2 in the past were wide accepted for specific business areas, as commercial transactions, for instance.
The other main reason to adopt a very closed solution is the legacy support. One of the products you cited (Pervasive SQL) acted as a no-effort port for BTrieve-based applications in late 90s, and it gained popularity thanks to the huge BTrieve community all over the planet.
Finally, last but not least, you should evaluate the TCO (Total Cost of Ownership) not only in terms of license price (single seat, network environment, site licenses and so on), but also for what concerns tech support, updates and availability for your platform. Many business units I know have been obliged to change their base OS for DB related problems.
Tip: add a bonus for custom solution that are proven or supported for usage in virtualized environments, if you aren't in seek for extreme performances. It will save more than a head ache for your DB manager.
In all other cases, rely on opensource/freesoftware DBs. MySql and Postgres for big projects, SQLite for single app persistence layer. Fairly standard and very good (community) support. Good value for no price.
I don't have any experiences with the proprietary database products you listed: 4D, Pervasive, FilemakerPro.
I'd be interested in knowing what those products offer that make them more attractive to you than the open source alternatives, you listed: MySQL and PostgreSQL.
I'd be interested in what makes those more attractive to you than the much more popular proprietary alternatives: Oracle, SQL Server, DB2, etc.
Without you providing more specifics, it's hard to advise you.
I personally feel safer using a widely used open source solution than a narrowly used closed source solution. The more widely used, the more battle-tested it's likely to be. The more open, the more control over my own destiny I have in case I do encounter some bug.
I have reported bugs to open source projects and gotten a quick fix. I have reported bugs to companies that make for-profit proprietary software and have gotten nothing.
I'm currently writing a large scale ASP.Net web app.
One of the thngs I can't find out about is how to justify when to use the cloud. E.g. when should I use google app engine/azure?
Also, when would I want to use bigtable over a standard dbms such as Sql Server?
Thanks
Cloud computing is all about scalability. It allows you to scale up AND scale down without having to rework your designs.
It works well for small sites, since you are only paying for resources used, but if you need to scale up, it just happens automatically (provided your application was designed for the cloud).
Also, there are theoretically much better tools in place for maintaining uptime and reliability in the cloud. For example, a system upgrade can happen without stopping your service, since the cloud computing platforms can automatically take on or off servers to service your application.
There's been a lot of talk about that from the Azure devs.
Also, there can be a financial motivation for using the cloud. Using a hosted cloud architecture can be less expensive than managing the multiple servers (DB, web, etc) that would be required for a traditional site, at least up front. As your usage goes up, the cost follows, but in theory, it can be more cost effective.
I'm not too familiar with anything else except app engine and EC2.
I'll try to add something to the previous answers:
The best thing about app engine is it's free until you attract a certain amount of users and you are charged for what your application uses, idle time is not charged.
Big table may differ from an rdbms architecturaly but from a perspective of a developer using it it's not that different.
Another good thing is python is supported. The bad thing is the standard library is crippled.
Also, you don't have full control over your data on the cloud (appengine), what I mean is you can't completely restrict the people from google from taking a peek in what you store there.
This question is very closely related to another question asked today:
"When shouldnt-you-use-a-relational-database?"
Relational databases and non-relational databases (like BigTable) address different needs. Not only in scale and performance, but in the structure and usage of the data.
The "Cloud" as I understand it is about scalability primarily. That is, the architecture refers to a capability to increase capacity in a scalable way.
Also, the Cloud is used frequently in reference to the Software-as-a-Service (SaaS) model, where someone else takes care of the servers, but that's an independent issue from the Cloud architecture. I.e. you could operate your own set of servers in a Cloud architecture.
So the justification for using the Cloud architecture is that you have an application that has a variable need for computing capacity. So it would be overkill to have N servers dedicated to match your peak level of activity. The Cloud allows you to vary your usage of the servers as your level of activity grows (and diminishes) over time.
The justification for using a SaaS model is that you don't want to be in the business of operating a data center. You're willing to relinquish some control and pay for the service, so that you can leave operation details to the experts in that technology. They handle backups, hardware failures, upgrades, 24x7 operation, etc. You handle your application and your business.
I recommend you subscribe to and read the High Scalability blog, especially some of the most visited posts such as those about the architecture of various large sites, as you will learn a lot from it that may help you make a decision. There is no hard rule as to when you should or should not use a cloud service or move from a relational database to a keyvalue system like BigTable.
One upside of cloud services in any case is that if you build your application with them, it will be immediately scalable and require much less rework later on if you require that kind of performance. However, in view of premature optimisation, it would be wise to be sure that you need that kind of scalability before you decide to build your app on such a platform.
There are several concepts to wrap your head around when using a datastore system like BigTable as well, such as not being able to just slam out writes like you would in a relational database, and having to precalculate a lot of your data rather than just doing that based on info from the database.
Although again, you can learn a lot from reading the abovementioned blog and related posts about Youtube, Plentyoffish, Google, etc.
You say you are "currently writing a large scale ASP.NET app". If you have made significant progress on it, you are already pass the point where you can justify using Google app engine or Azure. Both require significantly different architectures than you have build with a traditional application due to language support, database differences, and maturity.
Google App Engine is Python only so switching to it would require a complete rewrite
Big table is not a relational database and requires very different coding patters. SQL Data Services originally announced to be non-relational as well, but is moving to be more relational. I have not seen how close to a standard MSSQL database it currently is.
I would consider Google app engine to be a relatively immature platform so far. Database functionality is limited, you cannot run background processes, profiling and performance tuning tools are limited at best. Azure is currently in limited community preview, and so is not even available to ship a product on today.
While there are many very valid reasons to use a cloud architecture, moving to it will require significantly different architectures. Think about what effect changing that architecture (and possibly waiting for platform availability) will do to your release date.
If you are early in your project, cloud vs. not cloud is a great question to ask. If you have well on your way, I think that the importance of getting to shipping code and leveraging the work you have already put in should trump any benefits to the cloud you may see.