I am working on a network project and am already using MySQL as my backend. The project is coded in C++. When my data becomes large, it takes a lot of time to retrieve it from MySQL, so I explored other databases and came across Neo4j. After reading a lot about Neo4j on the internet, I have a few questions. The core requirements of my project are high performance and availability, which I am not getting once my database becomes huge.
My questions:
I am a little hesitant to use Neo4j, since I have read in places on the internet that it does not perform better than MySQL. Is that true?
There are no C++ Neo4j drivers, and it can be accessed only via REST APIs. Will this make my project even slower, since every operation is now an HTTP request and response?
Can we run Neo4j on Solaris? The server for the project will be Solaris.
Disclaimer: My answer might be biased since I'm working for Neo Technology. Nevertheless I will stay as objective as possible.
Regarding your questions:
It totally depends on your use case whether a graph database or a relational database performs better. A graph database excels when you run local queries (e.g. "who are the friends of my friends?"). By local query I'm referring to a case where you start at one or more bound nodes and then traverse through the graph. For global queries (e.g. "what is the average age of people in the db?") a graph database can perform at the same level as a relational database, but will not be significantly faster. However, if your global queries need to do a lot of traversals, the benefit of a graph database will also be significant.
No, using your language's HTTP capabilities will not be slower compared to using a driver. Most of the drivers add some convenience layer(s) for creating the request and parsing the response, and maybe some caching.
Neo4j, as a JVM-based database, can run on any JVM 7 enabled platform. However, Neo Technology's support offering currently covers Linux, Windows and HP-UX. If you need commercial-grade support for Solaris, please get in touch with me or my colleagues directly.
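On the HTTP point (question 2): a C++ client would typically POST Cypher queries as JSON and parse the JSON response. Here is a minimal sketch of the request shape, in Python for brevity; note that the `/db/data/cypher` endpoint path and payload layout follow Neo4j's legacy REST API and may differ between server versions, and the host/port are placeholders.

```python
import json
import urllib.request

# Hypothetical Neo4j server location; /db/data/cypher matches the
# legacy Neo4j REST API and may differ in newer server versions.
NEO4J_URL = "http://localhost:7474/db/data/cypher"

def build_cypher_request(query, params=None):
    """Build (but do not send) an HTTP request carrying a Cypher query."""
    payload = json.dumps({"query": query, "params": params or {}}).encode("utf-8")
    return urllib.request.Request(
        NEO4J_URL,
        data=payload,
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )

req = build_cypher_request(
    "START n=node({id}) MATCH (n)-[:KNOWS]->(friend) RETURN friend.name",
    {"id": 42},
)
# urllib stores header names capitalized, hence "Content-type" here:
print(req.get_method(), req.get_header("Content-type"))
```

Sending `req` with `urllib.request.urlopen` (or, in C++, any HTTP library such as libcurl) is one roundtrip per query, which is the latency a native driver would also pay.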
Related
I want to set up a TeamSpeak 3 server. I can choose between SQLite and MySQL as the database. I usually follow the rule "do not use SQLite in production", but on the other hand, it's a TeamSpeak server. Okay, let me just google it... I found this:
Speed
SQLite3 is much faster than MySQL database. It's because file database is always faster than unix socket. When I requested edit of channel it took about 0.5-1 sec on MySQL database (127.0.0.1) and almost instantly (0.1 sec) on SQLite 3. [...]
http://forum.teamspeak.com/showthread.php/77126-SQLite-vs-MySQL-Answer-is-here
I don't want to start a SQLite vs MySQL debate. I just want to ask: is his argument even valid? I can't imagine that what he says is true, but unfortunately I'm not expert enough to answer this question myself.
Maybe the TeamSpeak devs have some major differences in their DB architecture between SQLite and MySQL that explain a huge difference in speed (though I can't imagine that either).
At First, Access Time Will Appear Faster in SQLite
The access time for SQLite will appear faster at first, but only with a small number of users online. SQLite uses a very simplistic access algorithm; it's fast, but it does not handle concurrency.
As the database starts to grow and the amount of simultaneous access increases, it will start to suffer. The way database servers handle multiple requests is completely different: far more complex, and optimized for high concurrency. For example, SQLite will lock the whole database if an update is going on, and queue the other requests.
RDBMSs Do a Lot of Extra Work That Makes Them More Scalable
MySQL, for example, even with a single user will create an access queue, lock tables partially instead of allowing only one execution at a time, and perform other pretty complex tasks to make sure the database remains accessible to other simultaneous connections.
This makes a single-user connection slower, but it pays off later, when hundreds of users are online; at that point, the simple
"LOCK THE WHOLE DATABASE AND EXECUTE A SINGLE QUERY EACH TIME"
procedure of SQLite will hog the server.
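The whole-database lock described above is easy to observe directly. A minimal sketch using Python's built-in sqlite3 module: while one connection holds a write transaction, a second writer is refused immediately (`timeout=0` means "don't wait for the lock").

```python
import os
import sqlite3
import tempfile

# Shared on-disk database so two connections contend for the same lock.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE channels (id INTEGER PRIMARY KEY, name TEXT)")
writer.commit()

# First connection acquires the write lock and holds it...
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO channels (name) VALUES ('lobby')")

# ...so a second connection's write fails instead of queueing.
other = sqlite3.connect(path, timeout=0)
try:
    other.execute("INSERT INTO channels (name) VALUES ('general')")
    locked = False
except sqlite3.OperationalError:  # "database is locked"
    locked = True

writer.commit()  # release the lock; other writers may now proceed
print("second writer was locked out:", locked)
```

A server database like MySQL would instead queue or interleave the second writer using finer-grained locking, which is exactly the trade-off the answer describes.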
SQLite is made for simplicity and self-contained database applications.
If you are expecting 10 simultaneous writers at a time, SQLite may perform well, but you won't want a 100-user application that constantly writes and reads data using SQLite. It wasn't designed for such a scenario, and it will thrash resources.
Considering your TeamSpeak scenario, you are likely to be OK with SQLite. It is even OK for some businesses; some websites need databases that are read-only except when new content is added.
For this kind of use, SQLite is a cheap, easy-to-implement, self-contained solution that will get the job done.
The relevant difference is that SQLite uses a much simpler locking algorithm (a simple global database lock).
Using fine-grained locking (as MySQL and most other DB servers do) is much more complex, and slower if there is only a single database user, but required if you want to allow more concurrency.
I have not personally tested SQLite vs MySQL, but it is easy to find examples on the web that say the opposite (for instance). You do ask a question that is not quite so religious: is that argument valid?
First, the essence of the argument is somewhat specious. A Unix socket would be used to communicate to a database server. A "file database" seems to refer to the fact that communication is through a compiled-in interface. In the terminology of SQLite, it is server-less. Most databases store data in files, so the terminology "file database" is a little misleading.
Performance of a database involves multiple factors, such as:
Communication of query to the database.
Speed of compilation (ability to store pre-compiled queries is a plus here).
Speed of processing.
Ability to handle complex processing.
Compiler optimizations and execution engine algorithms.
Communication of results back to the application.
Having the interface be compiled-in affects the first and last of these. There is nothing that prevents a server-less database from excelling at the rest. However, database servers are typically millions of lines of code -- much larger than SQLite. A lot of this supports extra functionality. Some of it supports improved optimizations and better algorithms.
As with most performance questions, the answer is to test the systems yourself on your data in your environment. Being server-less is not an automatic performance gain. Having a server doesn't make a database "better". They are different applications designed for different optimization points.
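In that spirit, a minimal timing harness is easy to write. The sketch below times inserts against SQLite via Python's sqlite3 module, once with a commit per statement and once batched into a single transaction; the same structure applies to a server database by swapping the connect call. The numbers it prints are illustrative only.

```python
import sqlite3
import time

def timed_inserts(conn, n, batch):
    """Time n inserts, either one transaction per row or one overall."""
    conn.execute("CREATE TABLE t (x INTEGER)")
    start = time.perf_counter()
    if batch:
        with conn:  # a single transaction around all inserts
            conn.executemany("INSERT INTO t VALUES (?)",
                             ((i,) for i in range(n)))
    else:
        for i in range(n):
            conn.execute("INSERT INTO t VALUES (?)", (i,))
            conn.commit()  # one transaction per insert
    return time.perf_counter() - start

n = 1000
t_single = timed_inserts(sqlite3.connect(":memory:"), n, batch=False)
t_batch = timed_inserts(sqlite3.connect(":memory:"), n, batch=True)
print(f"per-statement: {t_single:.4f}s  batched: {t_batch:.4f}s")
```

Run it against your own schema, data sizes, and concurrency level before drawing conclusions; that is the whole point of the paragraph above.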
In short:
For local application databases, single-user applications, and simple little projects with small data, SQLite is the winner.
For networked database applications with multiple users and concurrency, load balancing, growing data management, security and role-based authentication, big projects, and widely used services, you should choose MySQL.
I do not know much about TeamSpeak servers or what kind of data they actually need to keep in their database, but if it just needs a local DBMS and does not have to handle a lot of concurrency, SQLite would be my choice.
Hi all.
I'm using DBExpress with C++ Builder (Delphi) 2007 and MySQL, Firebird, ...
I'd like to build a Win32 application that uses a database located on my web server.
I tried using DBExpress (TSQLConnection for MySQL), but it's very slow...
I also tried a local database with upload/download via Indy, but that was not good and a little complicated.
So what is the best way to use a web-based database from a Win32 application?
Do you have any experience? Any document or comment would be very welcome.
Thanks a lot.
Database connections over an Internet link (using a VPN or not) are slow; you are perfectly right. The main reason, IMHO, is the "ping" delay of every request, which is very low on a local network and much higher over the Internet. So a direct connection is not a good idea.
In latest versions of Delphi, you have the DataSnap components, which is the new "standard" (or Embarcadero recommended) way of doing remote access (including web access). Even if it was found at first to be a bit limited, the latest versions are perfectly usable, and are becoming a key product for cross-platform application building with Delphi. But it is not available for Delphi 2007.
One mature product (and one available for Delphi 2007) is Data Abstract:
Data Abstract is a framework for building database-driven applications
using the multi-tier data access model, for a variety of platforms.
Of course, this is not free, but this is a proven and efficient solution.
You may also take a look at our Client-Server ORM, which can connect to any DB, and is able to implement a RESTful SOA architecture with Delphi 2007, even without using the ORM part - that is, you can use your existing DBExpress-based source code, and expose easily some web interfaces to the data. It is Open Source, and uses JSON as communication format over a secured authentication mechanism. There is a lot of documentation included (more than 700 pages of PDF), which also tries to introduce to the SOA world.
Take a look at Datasnap: info
You need a data access library, which offers features:
Thread safety. In general, you will need to use a dedicated connection for each thread.
Connection pooling. To make connection creation (which is needed for (1)) fast, there must be a connection pool.
Fast SQL command execution, result set opening, and fetching.
Tracing. With any library you may run into performance issues, and you need a tool to see what is going wrong. For that you will need to see and analyze the client/server communication.
Result set caching and the ability to read a cached set simultaneously from different threads. You may have a few read-only tables which you fetch once and cache in your application, but you will need a mechanism to read this data from multiple threads: a kind of in-memory table cloning.
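A minimal sketch of point (5), in Python rather than Delphi for brevity: the lock guards only the one-time load, after which all threads read the same immutable snapshot. The class and method names are illustrative, not any particular library's API.

```python
import sqlite3
import threading

class CachedTable:
    """Fetch a read-only table once; serve the snapshot to many threads."""
    def __init__(self, conn, query):
        self._conn = conn
        self._query = query
        self._rows = None
        self._lock = threading.Lock()

    def rows(self):
        if self._rows is None:
            with self._lock:
                if self._rows is None:  # double-checked: load only once
                    self._rows = tuple(self._conn.execute(self._query))
        return self._rows

# check_same_thread=False because the first rows() call may run in a
# worker thread; after loading, reads never touch the connection again.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE countries (code TEXT, name TEXT)")
conn.executemany("INSERT INTO countries VALUES (?, ?)",
                 [("de", "Germany"), ("fr", "France")])
conn.commit()

cache = CachedTable(conn, "SELECT code, name FROM countries ORDER BY code")

results = []
threads = [threading.Thread(target=lambda: results.append(cache.rows()))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results[0])
```

Because the snapshot is an immutable tuple, no locking is needed on the read path, which is what makes this "clone once, read everywhere" pattern cheap.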
My answer is biased, but you may consider AnyDAC. It has all of these and many other features.
PS: dbExpress should work too. Try first to find the reason for your performance issue rather than switching to a different library, because the same may happen with the other library as well ...
DB applications over a slow link need a different approach than those using a fast link. You have to be careful about how much data you move around, and about how many roundtrips your application perform.
Usually an approach where the needed subset is cached on the client, modified, and then applied to the database is preferable (provided, of course, that changes do not need to be seen immediately and the chance of conflicts is low).
No middleware will help you much if the application is not designed with handling a slow link in mind.
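As a sketch of that approach: cache the needed subset on the client, edit it locally, and compute a single batch of changes to apply in one roundtrip. All names here are illustrative, not a real API.

```python
class CachedSubset:
    """Client-side snapshot of rows, with change tracking for batching."""
    def __init__(self, rows):
        # rows: {primary_key: {column: value}}, fetched once over the link
        self.original = {k: dict(v) for k, v in rows.items()}
        self.current = {k: dict(v) for k, v in rows.items()}

    def changes(self):
        """Compute the minimal batch of updates to send back."""
        batch = []
        for key, row in self.current.items():
            diff = {c: v for c, v in row.items()
                    if self.original[key].get(c) != v}
            if diff:
                batch.append(("UPDATE", key, diff))
        return batch

subset = CachedSubset({1: {"name": "Alice", "age": 30},
                       2: {"name": "Bob", "age": 25}})
subset.current[1]["age"] = 31  # edit locally; no network traffic yet
print(subset.changes())
```

Only the computed batch crosses the slow link, so the cost is one roundtrip per save instead of one per field edit; conflict detection (comparing `original` against the server's current state before applying) would be the natural next step.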
I come from the cliché land of PHP and MySQL on Dreamhost. BUT! I am also a JavaScript genie, and I've been dying to get on the Node.js train. In my reading I've inadvertently discovered a NoSQL solution called Redis!
With my shared web host and limited server experience (I know how to install Linux on one of my old Dells and do some basic server admin), how can I get started using Redis and Node.js? The next best question is: what does one even use Redis for? In what situations would Redis be better suited than MySQL? And does Node.js remove the need for Apache? If so, why do developers recommend the NGINX server?
Lots of questions, but there doesn't seem to be a solid source out there with all this info in one place!
Thanks again for your guidance and feedback!
NoSQL is just an inadequate buzz word.
I'll attempt to answer the latter part of the question.
Redis is a key-value store database system. Speed is its primary objective, so most of its use comes from event-driven implementations (as its tutorial goes over).
It excels at areas like logging, message transactions, and other reactive processes.
Node.js, on the other hand, is mainly for independent HTTP transactions. It is basically used to serve content very fast (much like a web server, though a Node.js service wouldn't necessarily be public facing), which makes it useful for backend business-logic applications.
For example, having a C program calculate stock values and having Node.js serve the content for another internal application to retrieve or using Node.js to serve a web page one is developing so one's coworkers can view it internally.
It really excels as a middleman between applications.
Redis
Redis is an in-memory datastore: all your data is stored in memory, which means a huge database implies huge memory usage, but with really fast access and lookups.
It is also a key-value store: you don't have relationships or queries to retrieve your data; you can only set a key-value pair and retrieve the value by its key. (Redis also provides useful types such as sets and hashes.)
These particularities make Redis really well suited for storing sessions in a web application, creating indexes on a database, and handling real-time data like analytics.
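As an illustration of the session use case, here is the pattern sketched with a plain in-memory dict standing in for Redis. A real implementation would use a Redis client (SETEX to store with a TTL, GET to look up, DEL to expire), but the expiry logic is the same idea; all names are illustrative.

```python
import time
import uuid

class SessionStore:
    """Key-value session store with TTL, mimicking Redis SETEX/GET."""
    def __init__(self, ttl_seconds=1800):
        self._store = {}  # session id -> (user id, expiry timestamp)
        self._ttl = ttl_seconds

    def create(self, user_id):
        sid = uuid.uuid4().hex
        self._store[sid] = (user_id, time.time() + self._ttl)
        return sid

    def get(self, sid):
        entry = self._store.get(sid)
        if entry is None:
            return None
        user_id, expiry = entry
        if time.time() > expiry:  # lazy expiry, as Redis does with TTLs
            del self._store[sid]
            return None
        return user_id

store = SessionStore(ttl_seconds=60)
sid = store.create("user-42")
print(store.get(sid))
```

The point of using Redis instead of this dict is that the store survives web-server restarts and is shared across processes, while keeping the same O(1) key lookup.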
So if you need something that will "replace" MySQL for storing your basic application models, I suggest you try something like MongoDB, Riak or CouchDB, which are document stores.
Document stores manage your data as something analogous to JSON objects (I know that's a huge shortcut).
Read this article if you want to know more about popular nosql databases.
Node.js
Node.js provides asynchronous I/O on top of the V8 JavaScript engine.
When you run a Node server, it listens on a port on your machine (e.g. 3000). It does not do any sort of domain-name resolution or virtual-host handling, so you have to put an HTTP server such as Apache or nginx in front of it as a proxy.
Choosing nginx over Apache in production is a matter of performance, and I find nginx easier to use. But I suggest you use the one you're most comfortable with.
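As an illustration, a minimal nginx server block that proxies to a Node process listening on port 3000 might look like this (the domain name is a placeholder):

```nginx
server {
    listen 80;
    server_name example.com;   # placeholder domain

    location / {
        # Forward every request to the Node.js process on port 3000
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

This is the virtual-host handling Node leaves to the front-end server: nginx matches the domain and port, and Node only ever sees plain requests on 3000.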
To get started, just install them and start playing with them. See HowToNode.
You can get a free plan from https://redistogo.com/ - it is a hosted redis database instance.
Quick intro to redis data types and basic commands is available here - http://redis.io/topics/data-types-intro.
A good comparison of when to use what is here - http://playbook.thoughtbot.com/choosing-platforms/databases/
Our website needs a content management type system. For example, admins want to create promotion pages on the fly. They'll supply some text and images for the page and the url that the page needs to be on. We need a data store for this. The criteria for the data store are simple and defined below. I am not familiar with CouchDB or MongoDB, but think that they may be a better fit for this than MySQL, but am looking for someone with more knowledge of MongoDB and CouchDB to chime in.
On a scale of 1 to 10 how would you rate MongoDB, CouchDB, and MySQL for the following:
Java client
Track web clicks
CMS like system
Store uploaded files
Easy to setup failover
Support
Documentation
Which would you choose under these circumstances?
Each one is suitable for different use cases, but for low-traffic sites MySQL/PostgreSQL is better.
Java client: all of them have clients
Track web clicks: Mongo and Cassandra are more suitable for this high-write situation.
Store uploaded files: Mongo with GridFS is suitable. Cassandra can store up to 2 GB per column, split into 1 MB chunks. MySQL is not suitable; storing only the file location in the database and keeping the file itself in the filesystem is preferred for Cassandra and MySQL.
Easy to setup failover: Cassandra is the best, Mongo second.
Support: all have good support; MySQL has the largest community, Mongo is second.
Documentation: 1st MySQL, 2nd Mongo.
I prefer MongoDB for analytics (web clicks, counters, logs; you need a 64-bit system) and MySQL or PostgreSQL for the main data. On the "companies using MongoDB" page on the MongoDB website, you can see that most of them use Mongo for analytics. Mongo can be suitable for main data after version 1.8. The problem with Cassandra is its poor querying capabilities (not suitable for a CMS), and the problem with MySQL is that it is not as easily scalable and highly available as Cassandra and Mongo; it is also slower, especially on writes. I don't recommend CouchDB; it's the slowest one.
my best
Serdar Irmak
Here are some quick answers based on my experience with Mongo.
Java client
Not sure, but one does exist and it is well supported. There are lots of docs, and even several POJO wrappers to make it easy.
Track web clicks
8 or 9. It's really easy to do both inserts and updates thanks to "fire and forget" writes. MongoDB has built-in tools to map-reduce the data, and easy tools to export it to SQL for analysis (if Mongo isn't good enough).
CMS like system
8 or 9. It's easy to store whole web page content, and it's really easy to "hook on" extra fields. This is really Mongo's bread and butter.
Store uploaded files
There's a learning curve here, but Mongo has a GridFS system designed specifically for both saving and serving binary data.
Easy to set up failover
Start your primary server: mongod --bind_ip 1.2.3.4 --dbpath /my/data/files --master
Start your slave: mongod --bind_ip 1.2.3.5 --dbpath /my/data/files --slave --source 1.2.3.4
Support
10gen has a mailing list: http://groups.google.com/group/mongodb-user. They also have paid support.
Their response time generally ranks somewhere between excellent and awesome.
Documentation
Average. It's all there, but it is still a little disorganized. Chalk it up to a lot of recent development.
My take on CouchDB:
Java Client: Is great; use Ektorp, which provides pretty easy and complete object mapping. Anyway, the whole API is just JSON over HTTP, so it is all easy.
Track web clicks: Maybe Redis is a better tool for this; CouchDB is not the best option here.
CMS like system: It is great, as you can easily combine templates, dynamic forms, data, etc., and collate them using views.
Store uploaded files: Any document in CouchDB can have arbitrary attachments, so it's a natural fit.
Easy to setup failover: Master/master replication makes sure you are always ready to go, and the database never gets corrupted, so in case of failure it's only a matter of starting Couch again; it will take over where it stopped (minimal downtime), and replication will catch up on the changes.
Support: There is a mailing list and paid support.
Documentation: Use the open book http://guide.couchdb.org and the wiki.
I think there are plenty of other posts related to this topic. However, I'll chime in since I've moved off MySQL and onto MongoDB. It's fast, very fast, but that doesn't mean it's perfect. My advice: use what you're comfortable with. If it takes you longer to refactor code to make it fit Mongo or Couch, then stick with MySQL if that's what you're familiar with. If this is something you want to pick up as a skill set, then by all means learn MongoDB or CouchDB.
For me, I went with MongoDB for a couple of reasons: file storage via GridFS, and geolocation. Yeah, I could've used MySQL, but I wanted to see what all the fuss was about. I must say I'm impressed, and I still have a ways to go before I can say I'm comfortable with Mongo.
With what you've listed, I can tell you that mongo will fit most of your needs.
I don't see anything here like "must handle millions of req/s" that would indicate rolling your own would be better than using something off the shelf like Drupal.
I have a prickly design issue regarding the choice of database technologies to use for a group of new applications. The final suite of applications would have the following database requirements...
Central databases (more than one database) using mysql (must be mysql due to justhost.com).
An application to be written which accesses the multiple mysql databases on the web host. This application will also write to local serverless database (sqlite/firebird/vistadb/whatever).
Different flavors of this application will be created for windows (.NET), windows mobile, android if possible, iphone if possible.
So, the design task is to minimise the quantity of code needed to achieve this. This is going to be tricky, since the languages used are already C# / Java (Android) and Objective-C (iPhone). I'm not too worried about that, but can the work required to implement the various database access layers be minimised?
The serverless database will hold similar data to the mysql server, so some kind of inheritance in the DAL would be useful.
I'm looking at Hibernate/NHibernate, and there is LINQ to whatever. So many choices!
Get a better host. Seriously, SQL Server hosts don't cost that much more; the difference is possibly an hour of development time per month, and that is already a non-conservative estimate.
Otherwise, throw out stuff you do not need. Settle on one language. If this is Internet-accessed, check out OData for exposing data; it's a nice independent protocol.
The rest is architecture. And LINQ (to SQL) sucks compared to NHibernate ;)
but can the database access layer be reused?
Yes, it can be, but you have to carefully create a loosely coupled data layer with no dependencies on other parts.
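A minimal sketch of such a layer, in Python for brevity: application code depends only on an abstract repository, so the serverless (SQLite) backend and a MySQL backend are interchangeable. All class and method names are illustrative; only the SQLite variant is shown, and a MySQL implementation would subclass the same interface.

```python
from abc import ABC, abstractmethod
import sqlite3

class Repository(ABC):
    """The only type application code is allowed to depend on."""
    @abstractmethod
    def add(self, name): ...
    @abstractmethod
    def all_names(self): ...

class SqliteRepository(Repository):
    """Serverless backend; a MySQL variant would implement the same API."""
    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")

    def add(self, name):
        with self._conn:
            self._conn.execute("INSERT INTO items VALUES (?)", (name,))

    def all_names(self):
        return [r[0] for r in self._conn.execute("SELECT name FROM items")]

def sync_names(source, target):
    """Application logic: sees only the Repository interface."""
    for name in source.all_names():
        target.add(name)

repo = SqliteRepository()
repo.add("promo-page")
print(repo.all_names())
```

Because `sync_names` never touches a concrete backend, the same function syncs the central MySQL databases to the local serverless store once a second `Repository` implementation exists; that is the loose coupling the answer calls for.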