any formal benchmarking of Open source Database software? - mysql

Is there any formal performance and stress test reports of open source database, specially sqlite,MySQL an PgSQL?
I want to use sqlite in server for its simple structure and easy embeddable capability. But I can not find any pros and cons (by Googling and Yahoo!ing) regarding performance of these database software.
Please suggest.

I found this article. It has a disclaimer at the top about the age of the information. However, it may be some help to you.
Here is another article that seems a little more recent and up2date.
Seems from reading these that SQLite is quite adequate in terms of performance.

Sysbench is a great utility for benchmarking mysql and I believe has plugins or the capability to test PostgreSQL. Keep in mind that you're not going to get a simple number that says "DBMS A is faster than DBMS B" -- at best you can hope to get an idea of what kind of scaling you'll get for a particular type of workload that is hopefully similar to whatever workload you'll end up throwing at your system.
Regardless of performance, if you really know what you are doing with RDBMS software and need an open source solution, you'll probably want to go with PostgreSQL -- otherwise, stick with MySQL.

Benchmark is not the most important in database choice.
I think SQLite and MySQL are quicker than Progres or Firebird but if you need some specific features like CTEs, only few database have even if it is SQL Standard

Benchmarking is hard. And expensive. And in the installations most of them are done, SQLite won't even be tested, because it's designed for completely different workloads and simply doesn't deal with the situation. (For example, any real benchmark will have clients running on different machines from the server, which SQLite AFAIK doesn't really do - whereas it does do very well in the case where you have a single client locally).
You can always look at something like spec, for example http://www.spec.org/jAppServer2004/results/jAppServer2004.html that shows both pg and mysql at least. But beware that the hardware platforms are different (and that these tests are also not from today).
But the bottom line is that if you want to compare performance for your application, the only really relevant benchmark you can run is your own application in a testing environment.

Related

MySQL vs PostgreSQL Concerns w/ GIS & Speed

I'm aware there are a few threads out there addressing this issue, but I'm wondering if anything has changed since those have been published.
I'm looking to build a GIS webapp, and people are all saying that PostgreSQL is the way to go because it supports various things that have to do with mapping better, whereas MySQL's spatial extensions aren't too great.
So PostgreSQL seems like the way to go, but everywhere I go I'm reading that PostgreSQL is terribly slow compared to MySQL, is this still true?
If I want to use GeoDjango with MySQL, will I be able to do most everything?
I'm really stuck between the two, simply because people keep saying PostgreSQL is really slow, but MySQL isn't really great for dealing with GIS stuff.
What's your take SO?
No, postgresql is not slower. This myth is due to people running single threaded sequential benchmarks on myisam vs postgresql. Benchmarks that attempt to model actual usage conditions with many concurrent queries put postgresql on par with or ahead of mysql in performance, especially as you scale up in CPUs/cores.
http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html
http://tweakers.net/reviews/657/5
In my opinion, it's silly to compare MySQL and PostgreSQL in terms of speed if there are variables unknown, such as - what's your budget, what's your target system output and what's your load rate?
Both RDBMSs are great, and they can be scaled. The difference is that MySQL has pluggable engine architecture, allowing it to plug in various engines. Natively, MySQL supports 9 engines if I'm not mistaken but it has a plethora of commercial engines to choose from, along with 2 popular forks (Percona's and MariaDB) that introduce various enhancements, especially for InnoDB storage engine.
Real question is, what does it mean that something is "bad" at GIS "stuff"? What does bad mean? Can't calculate something? Can't store something? I just don't get what you consider bad really.
I doubt you can go wrong by choosing either of the two databases, just beware of false benchmarks claiming one product is faster than another. Set your goal in terms of performance, install both products on your test machine and run them. If both satisfy your performance needs, use the one you feel more comfortable developing with.
Check this topic: GIS: PostGIS/PostgreSQL vs. MySql vs. SQL Server?
PostGIS is much more mature and complete, and competes with Oracle and SQL Server, not MySQL. Sorry.
When it comes to GIS capabilities, have a look at this GIS SE question:
Would PostGIS offer an advantage over MySQL for a produce farm application?
I think that from all that I read here and on GIS SE site, PostgreSQL with PostGIS is a clear winner when it comes to handling spatial data.

Database Management System for db-heavy, busy website/application?

EDIT: I've learnt, and it's probably true that YouTube uses MySQL. But it probably would be the enterprise edition and not free edition. The only alternative seems to be PostgreSQL. Long question short - - Can PostgreSQL used instead of MySQL? Is it a very good alternative in any case?
Firstly, I noticed that these are the most common names when it comes to (relational) database management systems - - DB2 (IBM), Oracle Database, Microsoft SQL, Ingres, MySQL, PostgreSQL and FireBird. So, should I presume these are the best?
Okay, of the above - - DB2 (IBM), Oracle Database and Microsoft SQL, the so-called Enterprise DBMSs, come with a bill; while MySQL (exclude enterprise version), PostgreSQL and FireBird are open source and free.
As should be clear from my previous questions here, I plan to build a photo-sharing site (something like Flickr, Picasa), and like any other, it's going to be database-heavy and (hopefully) busy.
Here's what I would love to know: (1) does any one of the free DBMSs stand up to the mark with the paid enterprise DBMSs? (2) Can any of the free DBMSs scale and perform well for enormous and busy databases without too much headbanging and facepalming?
Things in my mind w.r.t the DB:
Mature
Fast
Perform great/fine under heavy load
Perform great/fine as database grows
Scalable (smooth transition)
support for languages (preferably Python, PHP, JS, C++)
Feature-rich
etc (whatever I am missing)
PLZ NOTE: I know Facebook, Twitter etc use (or at least used) MySQL, and I see reports from time to time, how their sysadmins cry over that decision. So, please don't say, XXX uses it, so why can't you. They've started small, I am too. They've made mistakes, I don't want to. I want to keep the scaling-transition smooth. I hope I am not asking too much. Thanks.
"Which is the best database" is a huge question and is the subject of much contention. I've noticed on StackOverflow there is a tendency to close such questions; although the question is interesting, it is also quite unresolvable ;-)
FWIW, I would go with this:
Use what you know
If it doesn't conflict too heavily with the first rule, use something that is free of charge
Use what works with other parts of your stack
Use what you can hire for at reasonable cost (so, maybe not Oracle unless you really have to)
Don't optimise too early. Working slowly is much better than an unfinished, efficient website.
Also, scalability is not really to do with your db platform, but to do with how you design your site. Note also that some platforms scale better when adding more servers (MySQL) and others do better when increasing your server resources (PostgreSQL).
Please note as of today, MySql is not a free project aka as free as postgresql. One of the main reason why i had to switch over to PG. (Thankx to NPGSQL and PgAdmin III, it was a lot easier than it was rumoured)
However MySql does have number of advantages related to applications,addons,forums and looked good on resume.
PostgreSql is a much mature DBMS. It is a objectRDBMS. It has been around for more than 15 years. It is not known to have defaulted on any major issues. It is well known to handle transactions running in millions of rows successfully. The most important is, it's high rate of compliance with SQL standards. Infact in professional circles, it is more of an Oracle of Free RDBMS rather than MySql of popular applications.

Right DBMS for the job?

I need a DBMS, but do not know which to choose.
Basically, the application makes many INSERT / UPDATE, but also many SELECT. SELECT mostly very simple, one field only.
I am using MySQL + InnoDB at the moment, but as the database is growing, I need the best solution. The table can grow indefinitely, and the time +- 2GiB
EDIT:
Will run on Linux, and perhaps rarely in FreeBSD.
Not need a user management, all processes currently connect as root. Typically, there are many simultaneous accesses (now in 83 threads, according to the mysqladmin).
Access will be with C++, but need access to PHP also
PHPMyAdmin statistics:
select: 42.57%
insert: 7.97%
update: 49.45%
EDIT2:
After some thought, and the answers here, I believe that I can't use MySQL for your client library is GPL
Any alternative that does not harm (much) performance?
I think you have plenty of options.
You can continue to use MySQL. YouTube have used it fairly successfully
PostgreSQL (Free, Open Source, pretty good performance, reliable)
Oracle (NOT free, but has good support for very large databases)
If it's very simple queries, could it be done well with a key/value store?
According to this, the maximum database size on Linux 2.4+ (ext3) is 4TB. So I think you are safe to stick with MySQL+InnoDB if performance is adequate.
I would think MySQL is an excellent choice from what you've stated. Oracle isn't free, and has some overhead in all the security and enterprise level features that MySQL doesn't. You want support for multiple languages. MySQL can scale well (I believe Flickr is a good example). Most databases are accessible via most languages: e.g. Perl, Java and C all have driver based APIs ( JDBC, DBI and ODBC ). IIRC PHP has one very similiar to DBI. Also: starting with a database does allow you some wiggle room for the future: e.g. joins and aggregation.
One advice I would give is: make sure whatever you choose is ACID compliant. Also, You might take the time to compare PostGres and see if there is something about it that meets your needs as well or better than MySQL.

Which is the Best database for Rails application?

I am developing a Rails application that will access a lot of RSS feeds or crawl sites for data (mostly news). It will be something like Google News but with a different approach, so I'll store a lot of news (or news summaries), classify them in different categories and use ranking and recommendation techniques.
Should I go with MySQL?
Is it worthwhile using IBM DB2
purexml to store the doucuments?
Also Ruby search implementations
(Ferret, Ultrasphinx and others) are
not needed If I choose DB2. Is that correct?
What are the advantages of
PostreSQL in this?
Does it makes sense to use Couch DB in
this scenario?
I'd like to choose the best option but without over-complicating the solution. So I discarded the idea to use two different storage solutions (one for the news documents and other for the rest of the data). I'm also considering only "free" options, so I didn't look at Oracle or MS SQL Server.
purexml is heavier than SQL, so you pay more for your roundtrip between webserver and DB. If you plan to have lots of users, I'd avoid it, your better off letting your webserver cache the requests, thus avoiding creating xml(rss) everytime, if that is what you are thinking about.
I'd go with MySQL because its really good at serving and its totally free, well PostgreSQL is too, but haven't used it so I can't say.
CouchDB could make sense, but not if you plan on doing OLAP (Offline Analysis) of your data, a normal RDBMS will be better at it.
Admitting firstly that I generally don't like mysql, I will say that there has been writing on this topic regarding postgres:
http://oldmoe.blogspot.com/2008/08/101-reasons-why-postgresql-is-better.html
This is always my choice when I need a pure relational database. I don't know whether a document database would be more appropriate for your application without knowing more about it. It does sound like it's something you should at least investigate.
MySQL is probably one of the best options out there; light, easy to install and maintain, multiplatform and free. On top of that there are some good free client tools.
Something to think about; because of the nature of your system you will probably have some tables that will grow quite a lot very quickly so you might want to think about performance.
Thus, MySQL supports vertical partitioning but only from V 5.1.
It sounds to me the application you will build can easily become a large-scale web app. I would suggest PostgreSQL, for it has been known for its reliability.
You can check out the following link -- Bob Ippolito from MochiMedia tells us why they ditched MySQL for PostgreSQL. Although the posts are more than 3 years old, the issues MySQL 5.1 has recently tend to prove that they are still relevant.
http://bob.pythonmac.org/archives/category/sql/mysql/
MySQL is good in production. I haven't used PostgreSQL for rails, but it's a good solution as well.
In the dev and test environments I'd start out with SQLite (default), and perhaps migrate to your target DB in the test environment as you move closer to completion.

Where to find a good reference when choosing a database?

I and two others are working on a project at the university.
In the project we are making a prototype of a MMORPG.
We have decided to use PostgreSQL as our database. The other databases we considered were MS SQL-server and MySQL.
Does somebody have a good reference which will justify our choice? (preferably written during the last year)
Someone recently recommended me wikivs.com: MySQL vs. PostgreSQL - it is a quite detailed comparison of those two, and might be of help to you.
the most mentioned difference between MySQL and PostgreSQL is about your reading/writing ratios. If you read a lot more than you write, MySQL is usually faster; but if you do a lot of heavy updates to a table, as often as other threads have to read, then the default locking in MySQL is not the best, and PostgreSQL can be a better choice, performance-wise.
IOW, PostgreSQL scales better regarding to DB writes.
that's why it's usually said that MySQL is best for webapps, while PostgreSQL is more 'enterprisey'.
Of course, the picture is not so simple:
InnoDB tables on MySQL have a very different performance behaviour
At the load levels where PostgreSQL's better locks overtake MySQL's, other parts of your platform could be the bottlenecks.
PostgreSQL does comply better with standards, so it can be easier to replace later.
in the end, the choice has so many variables that no matter which way you go, you'll find some important issue that makes it the right choice.
Go with something that someone in your team has actual experience of using in production. All databases have issues which frequent users are aware of.
I cannot stress enough that someone in the team needs PRODUCTION experience of using it. Not using it for their homework, or to keep their list of CDs in.
All of these databases have their advantages and disadvantages. Which is better is dependent on:
Your teams experience
Your exact requirements
Your current environemnt e.g. whats your app written in and going to be hosted on?
SQL servers main problem is the cost unless you use express edition which has performance limitations however its very easy to use and has a number of good tools.
There is a comparison of the different sql versions at:
http://www.microsoft.com/sql/prodinfo/features/compare-features.mspx
You could then compare these with MySQL and PostGre.
If the purpose of this comparison is a theoretical one for your essay then you can reference web pages such as the microsoft link and compare performance, cost etc.
Postgresql has a page of case studies that you can quote and link to.
Really, any of the above would have worked for you. I personally like PostgreSQL. One solid advantage it has over MSSQL (even assuming you can get it for "free") is that PostgreSQL is non-proprietary. If you're going to introduce a dependency into your project (and re-inventing an RDBMS would be crazy), you don't want it to be a black box.