I'm evaluating some database possibilities for a large-scale (many billions of entries, many terabytes of data) storage solution where we will do random primary-key lookups almost exclusively.
Given its capabilities, Membase (Couchbase 1.8) looks almost like a perfect fit, and some previous tests make us believe it is highly performant for our use case. Our main concern with using it, though, is that since Couchbase 2.0 looks like a whole new direction from 1.8, the characteristics of the product might change. We like Couchbase 1.8 because it does exactly what we need very well.
We don't need views or map/reduce capabilities. While these are nice features to have, they are not something we need and not something we want if they are at all detrimental to performance. We've ruled out CouchDB mostly due to the complexity in scaling (adding/removing nodes), which is of course much better in Couchbase, but also because we had some concerns about its disk usage.
Is anyone aware of any performance measurements made for 1.8 vs 2.0, disregarding all view and M/R capabilities?
Will the 1.8 fork continue to be maintained? Or is 1.8 dead, and we should just move on?
There will be no major differences in performance between Couchbase 1.8 and 2.0. I am a Couchbase employee and one of the most important things to us was that existing customers are able to upgrade from Couchbase 1.8 to Couchbase 2.0 whether or not views were important to them and have their applications continue to perform at the same levels.
I recommend starting with Couchbase 1.8 since our 2.0 product will not be released until the end of October 2012, but when the time comes to upgrade to 2.0 you shouldn't have any issues.
NOTE: As of December 2012, Couchbase 2.0 is already available.
We have a reasonably big project in Django that has started to push against the limitations of Django (we mostly use Django for database-related stuff, not the web interface), and we decided to switch to SQLAlchemy while it's still possible (we don't want to get ourselves in this position:).
The problem is, it really seems this is the worst time we could have picked. SQLA is on the verge of releasing version 1.0, which will probably be a big change in the interface. More importantly, it seems like there is some trouble with releasing it: more than a month ago, Mike Bayer tweeted that a release candidate would be available via pip --pre, but it still hasn't happened.
The docs are updated to 1.0, and the bitbucket repo shows no diff between master and the 1_0 branch. If this were Django, I'd just clone the repository and install it directly; there is an official blessing for such a method in the Django documentation. But I can't see any hint that this is "accepted behaviour" in the SQLA community. For example, the installation page doesn't mention 1.0 at all.
Am I being too paranoid? Should we just use 0.9.8, and then make a few changes when 1.0 comes out? Or should we build 1.0 manually? Or would it be better to wait? (How long? I realize the SQLA team doesn't want to put pressure on themselves by talking about a release date, but Mike has kinda already done that with that tweet.:)
I realize this is not exactly an objective question, but someone with knowledge of the SQLA process might have valuable advice. For example, if someone asked me the same thing about Django 2.0, I'd tell them "if it isn't a mission-critical app, just clone and build from the newest repo state - the chance of breaking is small, and you're getting a much better interface". And I'd have the official docs behind me.
As of the day of rewriting this answer, if I had to choose between SQLAlchemy 0.9.8 (the stable version, released on October 13, 2014) and 1.0 (the "upcoming" version), I would personally pick the stable version.
In the software life cycle, beta, bleeding-edge, and nightly builds tend to have more bugs or breaking changes, which can directly break your system or script.
Therefore, choosing a stable version is more appropriate in most cases, unless you need a new feature that is only in the beta version.
Lastly, there are usually migration guides for upgrading your version, but not for downgrading it. In some cases (though probably not in SQLAlchemy's case), an upgrade is irreversible.
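To make the stable-versus-pre-release distinction concrete, here is a minimal Python sketch that filters version strings by their pre-release marker. The function names are hypothetical, the regular expression is a simplified assumption about PEP 440-style markers (a, b, rc, dev), and the version strings are only illustrative; this is not how pip itself decides.

```python
import re

# Simplified assumption: a pre-release is a dotted numeric version followed
# by an a / b / rc / dev marker, e.g. "1.0.0b1" or "1.0.dev0".
_PRE_RELEASE = re.compile(r"\d+(?:\.\d+)*\.?(?:a|b|rc|dev)\d*")


def is_prerelease(version: str) -> bool:
    """Return True if the version string carries a pre-release marker."""
    return _PRE_RELEASE.fullmatch(version.strip().lower()) is not None


def pick_stable(versions):
    """Return the versions without a pre-release marker, in input order."""
    return [v for v in versions if not is_prerelease(v)]


if __name__ == "__main__":
    # Illustrative version strings only
    print(pick_stable(["0.9.8", "1.0.0b1", "1.0.dev0"]))
```

By default pip applies a similar rule on its own: pre-releases are only considered when `--pre` is passed, so a plain install resolves to the latest stable release.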
An interesting question is how Pentaho Data Integration fits, and whether it would be useful, in an environment that involves BPM (Bonita software) and an ESB, or Enterprise Service Bus (Mule).
I didn't find any documentation about it. Maybe I have misunderstood these two concepts, but I really would like to know how and when I can use these two approaches.
To be more clear: how can I use Pentaho Data Integration to improve the business workflow and have it work together with an ESB platform?
It sounds like a very generic question about how to do system integrations.
You will have your high-level (business perspective) business processes guiding your company, probably gathering data from Pentaho and showing business data through it, and the ESB will be in charge of handling how the systems used by the business processes communicate with each other.
I wrote these slides for jBPM5 some time ago, but I think they will help you understand how all these technologies fit together:
http://www.slideshare.net/salaboy/jbpm5-community-training-module-25-bpm-for-developers
Cheers
There is an integration of Mule and PDI, but it doesn't appear to have been used much. See here: http://jira.pentaho.com/browse/PDI-7416
There is also an enormous overlap in the tools. Obviously Mule contains ETL functionality, and similarly PDI can do ESB-like operations. So there is good sense in integrating and using the best of both!
Certainly Mule/ESB seems to be where it's at with the whole "data in motion" concept.
I have often wondered why MySQL has become so popular. Any ideas why? Are there specific reasons behind its success? (Please keep answers analytical)
It is free, which means it sees more use on personal projects as well as on hosting platforms that provide a DBMS solution.
It is one of the few solutions that can run on almost any operating system.
It uses basic SQL rather than a specialized variant, meaning that it requires less-specialized knowledge to use.
Setup and configuration are more straightforward and less time-consuming than with most other options.
To add a bit more: it is pretty fast with MyISAM.
As for what is meant by "free":
If you are using Oracle and you want to set up multiple instances on different boxes, you will probably be required to pay for each box.
Unless you have a big budget to spend, Oracle just doesn't sound great.
Postgres is also free.
MySQL is easier to learn due to its friendly SQL (which is not standards-compliant).
Early support in languages like PHP had a bit to do with it as well. While MySQL's C API is relatively straightforward (provided you are comfortable managing your callbacks), the PHP implementation made it crazy easy to use. Some would argue too easy to use.
I've worked in the hosting industry for quite a while, and notice trends. Almost as soon as PHP added support for SQLite3, people started asking for it to be installed. I'm not saying that PHP is the only contributing factor, nor can I guess at just how much of a factor it was, but it did have a bit to do with it.
After all, they call it LAMP for a reason.
It's open source and free (Community Edition).
Ubiquity, cost and performance.
Open Source - free of cost [GNU license] - It is one part of LAMP: Linux, Apache, MySQL and PHP. So it is well suited to developing websites in PHP, etc.
Lightweight - In web development, space is usually an important issue. MySQL occupies less memory compared to others, so it is considered lightweight.
While giants like Oracle and SQL Server fought over enterprise applications, MySQL focused on web development and became popular there.
...
Simple question: I want to know whether Java is free (especially for web development). Later on, if I've built a large website, will the servers and databases cost me much, like in .NET for example?
Cheers.
Java is free. Check licences of frameworks you're using, but you shouldn't worry about that since most of them are free.
Servers (physical) will, of course, cost you.
There are free application servers like Glassfish and JBoss.
There are free databases like MySQL and PostgreSQL.
So, you can get away with everything being free except hardware and, possibly, hosting of your web app.
Yes and no - depending on how big your site gets, you may be required to invest more money into better servers/databases.
It's not really something that can just be answered, without looking into the future.
Update, as of 2021
Be sure to read the document prepared by pillars of the Java community, Java is Still Free. This document provides a short overview as well as a longer section with all the gory details.
Understand that Java is a set of specifications, not a product.
Java Language and Virtual Machine Specifications
JEPs and JSRs
Many vendors provide binaries or installers for an implementation of Java. Nowadays, all of those implementations are based largely or entirely on the OpenJDK project. Participants including Oracle, IBM, SAP, Apple, Azul Systems, and more have banded together to pool their best technologies for implementing Java as open-source free-of-cost.
The OpenJDK project provides only source-code. Various vendors build that source code to provide binaries or installers for you and me to conveniently put Java on our computers. Some of their distributions of Java are available free-of-cost, and some are commercial with paid support. Some are general-purpose JVMs, and some are special-purpose. Some are a basic JDK and some have bundled extras.
Here is a diagram I made to help you in choosing a vendor for a Java implementation.
And some considerations to think about when choosing a vendor.
When a new version of a framework or language appears (e.g. .NET 3.5, SQL2008), what approach do people take in deciding when to adopt/upgrade?
Generally developers will say as soon as possible (they want it on their CV and from a management perspective giving them what they want provides a motivation boost) but commercially there is often little incentive (few clients demand the latest version) and from a cost perspective (retest, training) there is often a disincentive.
I'm particularly thinking of "on-going" systems and projects (such as in a software house) which exist and evolve over years where taking the "new projects use the new technology" approach doesn't work.
Are people driven by specific requirements (the need to use a new feature, a potential or existing client demanding support for it), do they formally assess it (in which case what are the criteria) or do they upgrade as a matter of routine (in which case when - leading edge vs. bleeding edge)?
Do people think that not being on the latest version of something should be considered technical debt and managed as such?
Or is "if it ain't broke don't fix it" a valid approach?
Read up on Technical Debt. This is a simple cost-benefit decision.
The "if it ain't broke don't fix it" is a common management policy that says "tomorrow's dollars aren't worth as much as today's, so don't plan for future improvements." Eventually technical debt accumulates to the point where the product can no longer limp along.
The most common breaking point is when some piece of the infrastructure is no longer supported. By then, incremental change is impossible.
Reinventing from scratch is a new capital investment. Fixing existing code is an expense. The accounts force management to make technically crazy decisions.
In the case of open source software, it requires careful technical management since there's no official "support sunset" announcement from Oracle/Sun. Bad technical management, of course, leads to technical bankruptcy.
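The "tomorrow's dollars aren't worth as much as today's" argument above is ordinary discounting. A minimal Python sketch, using made-up figures and an assumed discount rate purely for illustration, shows how a deferred cost can look cheaper today even when it is larger in nominal terms:

```python
def present_value(amount: float, rate: float, years: int) -> float:
    """Discount a future amount back to today's money at a yearly rate."""
    return amount / (1.0 + rate) ** years


# Hypothetical numbers, not taken from any real project
upgrade_now = 100_000.0      # cost of upgrading today
rewrite_later = 180_000.0    # cost of a forced rewrite in 5 years
discount_rate = 0.15         # assumed yearly discount rate

pv_later = present_value(rewrite_later, discount_rate, 5)

if __name__ == "__main__":
    print(f"upgrade now:   {upgrade_now:,.0f}")
    print(f"rewrite later: {pv_later:,.0f} in today's money")
```

With these assumed figures the deferred rewrite discounts to less than the cost of upgrading today, which is the accounting pressure described above. The sketch deliberately ignores the risk that the later cost keeps growing or that incremental change becomes impossible.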
We look at the support lifecycle costs. For how long are the older versions supported, and at what costs? Platforms like Windows and Java tend to move fast as compared to mainframe environments, and part of the cost of doing business on those platforms is to perform periodic upgrades. In a rational world, that is!
New versions can have killer features we need -- but that is rare in enterprise development. The main positive selling point of new versions (as opposed to negative ones such as expired support) tends to be greater developer efficiency, which is hard to measure. Against that, as you indicate, the cost of retraining must be considered, not only for the initial developers, but, crucially, for maintenance. In each upgrade, some applications tend to be left behind as too critical to retire and too expensive/fragile to upgrade. Over time, the number of platforms and versions you have to support increases overall technical debt (no matter their age).
Another criterion for upgrading to new versions (which you note) is the ability to attract and retain staff. With the current economic phase, that's playing second fiddle, but still cannot be ignored completely. You want to have at least a seasoning of enthusiastic and knowledgeable developers.
I think the killer question is whether your app will survive long term if you NEVER upgrade the platform/language version. If you think it can't, you may as well upgrade sooner rather than later, as it will only become harder.
Think about how long your app should be actively developed until you need a full rewrite. If you never plan to rewrite it, I would upgrade continually. Consider how difficult it will become to find the best developers if you are working in an outdated technology. Consider how new framework/language features could speed up your development process in the long term, for a bit of short term pain.
When you really need to. .NET 1.0 was crappy, 1.1 was a nice upgrade, but Web development with VS2003 was not so smooth. Things improved with VS2005 and .NET 2.0, and I still see many developers and companies sticking with .NET 2.0. The previous versions were too fresh; version 2.0 was mature tech. So, if you were happy with 1.1, why would you upgrade? If you are happy now with 2.0, why upgrade to 3.5 or 4.0?
When the benefits of upgrading (more features, or a bugfix you need) outweigh the risks/costs involved (new issues, breaking existing code).
When you develop for Microsoft-based platforms, like a Windows Forms app for Windows or an ASP.NET web app for Windows Server, a good time to migrate is every two major versions of the OS. For example, if your app was developed for Windows 2000, you ought to migrate to Vista, though XP can be skipped. Similarly, if it was designed for XP SP2, you can safely ignore Vista and target Win 7. Microsoft never (or rarely) breaks things in incremental OS updates. So an app running on today's OS will definitely run on the next. But never on the one following it. (If it runs, how can M$ make money???)
(Source: Self... Windows developer for over 5 years)
I'm in the upgrade-as-soon-as-possible camp (though I might wait a month after a new version comes out, in case of uncaught issues). There are a few things you need to think about:
1. Security Releases
Many of the people who tell me "if it isn't broke, don't fix it" are the same people who close both eyes when security patches get released. Think Equifax.
To me it is an ethical responsibility to at least be on security supported versions of a framework. We owe it to our customers to safeguard their data.
2. Attracting & Retaining Talents
There is a lot of talk about how the programming language or framework used doesn't matter. But in my experience, the cleanest code and design for a web app are usually written by the people who are passionate about the framework & programming language used because of their experience & expertise with it.
These people are unlikely to stay around for long or join your company if you stick to a very old version. Please think about your developers' happiness.
3. Newer, simpler ways offered by the newer version
Very often newer versions of a framework make something that was hard in the past much easier. If we do not upgrade, we miss out on the good new packages/features and we write our code in the old frustrating way, knowing there is a much simpler way to achieve the same thing. And when it comes time to upgrade, we may end up having to change again to the new way. So why not upgrade, use the new better way, and waste less time?