I want to set up a teamspeak 3 server. I can choose between SQLite and MySQL as database. Well I usually tend to "do not use SQLite in production". But on the other hand, it's a teamspeak server. Well okay, just let me google this... I found this:
Speed
SQLite3 is much faster than MySQL database. It's because file database is always faster than unix socket. When I requested edit of channel it took about 0.5-1 sec on MySQL database (127.0.0.1) and almost instantly (0.1 sec) on SQLite 3. [...]
http://forum.teamspeak.com/showthread.php/77126-SQLite-vs-MySQL-Answer-is-here
I don't want to start a SQLite vs MySQL debate. I just want to ask: Is his argument even valid? I can't imagine it's true what he says. But unfortunately I'm not expert enough to answer this question myself.
Maybe the TeamSpeak devs have some major differences in their DB architecture between SQLite and MySQL that would explain a huge difference in speed (I can't imagine this).
At first, access time will appear faster in SQLite
The access time for SQLite will appear faster at first, but only while a small number of users are online. SQLite uses a very simple access algorithm; it's fast, but it does not handle concurrent writes well.
As the database grows and the amount of simultaneous access increases, it will start to suffer. The way database servers handle multiple requests is completely different: far more complex and optimized for high concurrency. For example, SQLite locks the whole database (not just one table) while an update is in progress and queues the other requests.
Database servers do a lot of extra work that makes them more scalable
MySQL, for example, will set up an access queue even with a single user, lock only the rows or tables it needs instead of allowing only one user at a time to execute, and perform other fairly complex tasks to make sure the database remains accessible to any other simultaneous access.
This makes a single-user connection slower, but it pays off later, when hundreds of users are online. In that case, the simple
"LOCK THE WHOLE DATABASE AND EXECUTE A SINGLE WRITE AT A TIME"
procedure of SQLite will hog the server.
SQLite is made for simplicity and Self Contained Database Applications.
If you expect around 10 simultaneous connections writing to the database at a time, SQLite may perform well, but you won't want a 100-user application that constantly writes and reads data using SQLite. It wasn't designed for such a scenario, and it will thrash resources.
Considering your TeamSpeak scenario, you are likely to be OK with SQLite. It is fine even for some businesses: some websites need databases that are read-only except when new content is added.
For these kinds of uses, SQLite is a cheap, easy-to-implement, self-contained solution that will get the job done.
The relevant difference is that SQLite uses a much simpler locking algorithm (a simple global database lock).
Using fine-grained locking (as MySQL and most other DB servers do) is much more complex, and slower if there is only a single database user, but required if you want to allow more concurrency.
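To make the "global database lock" concrete, here is a minimal sketch using Python's standard sqlite3 module. The table names (channel, client) and the helper name are made up for illustration and have nothing to do with TeamSpeak's real schema. It assumes SQLite's default rollback-journal mode and uses timeout=0 so the second writer fails right away instead of waiting.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked() -> bool:
    """Return True if a write to a *different* table fails while another
    connection holds an open write transaction on the same database file."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    # timeout=0: raise immediately instead of waiting for the lock.
    first = sqlite3.connect(path, timeout=0, isolation_level=None)
    second = sqlite3.connect(path, timeout=0, isolation_level=None)
    first.execute("CREATE TABLE channel (id INTEGER PRIMARY KEY, name TEXT)")
    first.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, nick TEXT)")

    first.execute("BEGIN IMMEDIATE")  # take the database-wide write lock
    first.execute("INSERT INTO channel (name) VALUES ('Lobby')")
    try:
        # Different table, same file: still refused while 'first' is writing.
        second.execute("INSERT INTO client (nick) VALUES ('guest')")
        blocked = False
    except sqlite3.OperationalError:  # SQLite reports "database is locked"
        blocked = True
    first.execute("COMMIT")
    first.close()
    second.close()
    return blocked


print(second_writer_is_blocked())
```

A server with row-level or table-level locking would typically let the second insert through, since it touches a different table. That is the extra concurrency the more complex locking buys.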
I have not personally tested SQLite vs MySQL, but it is easy to find examples on the web that say the opposite (for instance). You do ask a question that is not quite so religious: is that argument valid?
First, the essence of the argument is somewhat specious. A Unix socket would be used to communicate to a database server. A "file database" seems to refer to the fact that communication is through a compiled-in interface. In the terminology of SQLite, it is server-less. Most databases store data in files, so the terminology "file database" is a little misleading.
Performance of a database involves multiple factors, such as:
Communication of query to the database.
Speed of compilation (ability to store pre-compiled queries is a plus here).
Speed of processing.
Ability to handle complex processing.
Compiler optimizations and execution engine algorithms.
Communication of results back to the application.
Having the interface be compiled-in affects the first and last of these. There is nothing that prevents a server-less database from excelling at the rest. However, database servers are typically millions of lines of code -- much larger than SQLite. A lot of this supports extra functionality. Some of it supports improved optimizations and better algorithms.
As with most performance questions, the answer is to test the systems yourself on your data in your environment. Being server-less is not an automatic performance gain. Having a server doesn't make a database "better". They are different applications designed for different optimization points.
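If you do want to test it yourself, a rough timing harness could look like the sketch below. It is only an illustration: the function name time_workload, the channel table and the 1000-insert workload are invented for the example, and an in-memory SQLite database stands in for the real thing. The same function should work with any DB-API connection, but note that MySQL drivers usually expect %s placeholders instead of ?.

```python
import sqlite3
import time


def time_workload(conn, statements, repeats=3):
    """Run all statements `repeats` times and return the best wall-clock
    time in seconds for one full pass (including the commit)."""
    best = float("inf")
    for _ in range(repeats):
        cur = conn.cursor()
        start = time.perf_counter()
        for sql, params in statements:
            cur.execute(sql, params)
        conn.commit()
        best = min(best, time.perf_counter() - start)
        cur.close()
    return best


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE channel (id INTEGER PRIMARY KEY, name TEXT)")
workload = [
    ("INSERT INTO channel (name) VALUES (?)", (f"room-{i}",))
    for i in range(1000)
]
elapsed = time_workload(conn, workload)
row_count = conn.execute("SELECT COUNT(*) FROM channel").fetchone()[0]
print(f"best of 3 passes: {elapsed:.4f} s for {len(workload)} inserts")
```

Run the same workload against both databases on the machine you will actually deploy on. Numbers from someone else's forum post tell you very little about your own setup.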
In short:
For local application databases, single-user applications, and small, simple projects that keep little data, SQLite is the winner.
For networked database applications, multi-user access and concurrency, load balancing, growing data-management needs, security and role-based authentication, big projects, and widely used services, you should choose MySQL.
As for your question, I do not know much about TeamSpeak servers or what kind of data they actually need to keep in the database, but if it just needs a local DBMS and does not need to handle much concurrency or management, SQLite would be my choice.
Related
I am new to server side programming and am trying to understand relational databases a little better. Whenever I read about MYSQL vs SQLite people always talk about SQLite not being able to have multiple users. However, when I program with the Django Framework I am able to create multiple users on the sqlitedb. Can someone explain what people mean by multi-user? Thanks!
When people talk about multiple users in this context, they are talking about simultaneous connections to the database. The users in this case are threads in the web server that are accessing the database.
Different databases have different solutions for handling multiple connections working with the database at once. Generally reading is not a problem, as multiple reading operations can overlap without disturbing each other, but only one connection can write data in a specific unit at a time.
The difference between concurrency for databases is basically how large units they lock when someone is writing. MySQL has an advanced system where records, blocks or tables can be locked depending on the need, while SQLite has a simpler system where it only locks the entire database.
The impact of this difference is seen when you have multiple threads in the webserver, where some threads want to read data and others want to write data. MySQL can read from one table and write into another at the same time without problem. SQLite has to suspend all incoming read requests whenever someone wants to write something, wait for all current reads to finish, do the write, and then open up for reading operations again.
As you can read here, SQLite supports multiple users, but it locks the whole database.
SQLite is usually used for development, but MySQL is a better choice for production because it has better support for concurrent access and writes, which SQLite lacks.
Hope this helps.
SQLite concurrency is explained in detail here.
In a nutshell, SQLite doesn't have the fine-grained concurrency mechanisms that MySQL does. When someone tries to write to a MySQL database, the MySQL database will only lock what it needs to lock, usually a single record, sometimes a table.
When a user writes to a SQLite database, the entire database file is momentarily locked. As you might imagine, this limits SQLite's ability to handle many concurrent users.
Multi-user means that many tasks (possibly on many separate computers) can have open connections to the database at the same time.
A multi-user database provides things like locks to allow these tasks to update the database safely.
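To show the two meanings of "user" side by side, here is a small sketch with Python's sqlite3 module. The table name auth_user mirrors Django's default user table, but the schema is cut down to one column for illustration. Rows in that table are application users; the two connection objects are the "users" people mean when they talk about multi-user databases.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.db")

# Two independent connections stand in for two web-server threads/processes.
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

# Application-level users are just rows; any number of them can exist.
writer.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT)")
writer.executemany(
    "INSERT INTO auth_user (username) VALUES (?)", [("alice",), ("bob",)]
)
writer.commit()

# The second connection sees the committed rows.
usernames = [
    row[0] for row in reader.execute("SELECT username FROM auth_user ORDER BY id")
]
print(usernames)
```

So SQLite is fine with many application users and with several open connections. The limitation being discussed only shows up when several of those connections try to write at the same moment.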
Look at ScimoreDB. It's an embedded database that supports multi-process (or user) read and write access. It also can work as a client-server database.
I've seen pictures like this where multiple rails engines write to a single mySQL server.
1) Is this possible? Or does Rails want each application server to write to one database server?
2) If this is possible, how is it accomplished? Are there queues and a scheduler between the application servers and the write database server?
Scaling a MySQL DB is a pretty difficult thing to do, but it's certainly been done plenty of times, and there are a lot of best practices out there for you to take advantage of. The first thing you should know is that you probably won't need to worry about scaling writes for a while yet; you need to scale your reads first.
Scaling reads can be done fairly easily using replication. There are several tools out there that make managing replication a lot easier, such as Amazon RDS. Generally speaking, many web servers can connect to many databases (as suggested by others); however, you quickly run into scale issues once you have a lot of traffic, connections, or whatever other action you are performing that generates load on the server.
As replicated servers are read-only, you need to manage which server you connect to depending on the action you're performing. For example, if you had a users table, then when creating, updating or deleting users you need to use the "write" database (the primary "source" server), but when reading the users table you can use one of the read replicas. This reduces the load on the primary write server (allowing it to deal with even more writes), and as you can have multiple read databases behind a load balancer, you can get away with this structure for a very long time and scale reads across tens of database servers before you hit any significant issues (however, most apps get away with 1-3).
There are situations where you will need to use your write database for read actions (although you should avoid it as much as possible), as the read replicas can be slightly behind the write DB due to latency in replicating its queries. Most of the time, however, you should be able to code knowing that the read DB may be delayed (i.e. queue actions for a reasonable period of time so that the updates propagate across all the read servers) and simply use one of your read DBs rather than the write DB.
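As a rough sketch of the "pick a server per action" idea, something like the following could sit between your app and the database drivers. Everything here is hypothetical: the ConnectionRouter class, the needs_fresh_data flag and the server names are made up for the example, and the verb check is deliberately naive. In a real Rails app you would normally rely on the framework's or an adapter's multi-database support rather than roll your own.

```python
import itertools


class ConnectionRouter:
    """Send writes to the primary and spread plain reads across replicas."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def pick(self, sql, needs_fresh_data=False):
        """Return the server that should run this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        locking_read = sql.rstrip().rstrip(";").upper().endswith("FOR UPDATE")
        # Writes, locking reads, and reads that cannot tolerate replication
        # lag all go to the primary; everything else rotates over replicas.
        if verb in self.WRITE_VERBS or locking_read or needs_fresh_data:
            return self.primary
        return next(self._replicas)


router = ConnectionRouter("primary-db", ["replica-1", "replica-2"])
print(router.pick("UPDATE users SET name = 'x' WHERE id = 1"))
```

The needs_fresh_data flag is the escape hatch for the replication-lag cases described above, where a read has to see a write that may not have reached the replicas yet.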
Beyond this the key items to work on are ensuring you have efficient indexes and applying other best practices around maintaining a sensible data structure. You might also want to consider having 3 distinct "groups" of database servers. I generally like to have write, read and "stats" db groups. The write group for create, update and delete operations (as well as select for update), the read for general read items that must return their results quickly, and stats for anything that is going to be under high load and that you do not rely on for a prompt response (this keeps heavy queries that are not time sensitive away from your read db that you need quick responses from for general reads)
Once you get into a situation where you can no longer buy larger hardware and you're near maxing out your write capacity, you'll need to look into sharding. However, that will take a lot of traffic / data, so don't worry about it unless you've done all of the above already.
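For a feel of what sharding means at the application level, here is a minimal sketch of hash-based shard selection. The function name shard_for and the choice of user_id as the shard key are assumptions for illustration only; real setups often use directory lookups or consistent hashing instead.

```python
import hashlib


def shard_for(user_id: int, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used to get the same answer everywhere."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Every app server computes the same shard for the same user.
print(shard_for(42, 4))
```

The catch with this simple modulo scheme is that changing shard_count remaps most keys, which is one reason sharding is worth postponing until you really need it.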
I need to improve a PHP-MySQL web application, which only uses MySQL for REPL operations (and some search functions). 99% of the applications that I worked with never used advanced MySQL features, like replication, cross-table constraints, locking etc.
To my understanding I should instead use SQLite.
Are there any practical benefits if I do this?
Will I see a significant (>100ms) speed boost?
Should I expect problems with tables with more than 1,000,000 rows?
There is no catch-all answer to that, but there is a main point to consider: A very good rule of thumb is, that the higher your degree of concurrency is, the more you'll profit from MySQL and vice versa.
This means that in a scenario where database requests are never concurrent, you might see a speedup by using SQLite, though I doubt it would be on the order of 100ms.
The reason behind this is (very roughly):
In a database server environment, such as MySQL, PostgreSQL, MS SQL, Oracle and friends, a dedicated process (or a group of processes) exclusively touches the database files - the important part being dedicated. This means that concurrency issues can be resolved in-process.
In a file-based database, such as SQLite, MS Access (Jet Engine) and friends, multiple processes will touch the DB files without knowing of each other - this implies that concurrency issues have to be resolved by writing them to the DB or helper file(s). This is typically much slower and less robust. In exchange for that, the overhead of communication between the database client (the web app) and the database engine (which is in-process) is nonexistent.
Edit
After comment I want to make it more clear, that I am talking of concurrent writes, not concurrent reads. Concurrent reads of an unchanging dataset is not a hard problem - it doesn't need any locking at all.
The principal advantage of SQLite is that it is a file-based relational database that uses SQL as its query language. Being file-based tremendously simplifies deployment, making it very good for the case where an application needs a little database but must be run in an environment where having a database server would be problematic. (For example, many browsers use SQLite to manage their cookie stores; using a database server for that problem would be verging on the insane in many ways.)
The principal advantage of MySQL (with a sane table type) is that it is a database server that uses SQL as its query language. Being server-based allows for many features that a file-based system can't handle simply (such as replication) but does make things quite a bit more complex to deploy.
Whether the benefits of the additional complexity of a database server (e.g., MySQL) outweigh the costs (relative to a file-based database engine like SQLite) depends on a great many factors, notably including how many installations are expected and who is expected to perform those installations.
I am working with large datasets (10s of millions of records, at times, 100s of millions), and want to use a database program that links well with R. I am trying to decide between mysql and sqlite. The data is static, but there are lot of queries that I need to do.
In this link to sqlite help, it states that:
"With the default page size of 1024 bytes, an SQLite database is limited in size to 2 terabytes (2^41 bytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes."
I'm not sure what this means. When I have experimented with mysql and sqlite, it seems that mysql is faster, but I haven't constructed very rigorous speed tests. I'm wondering if mysql is a better choice for me than sqlite due to the size of my dataset. The description above seems to suggest that this might be the case, but my data is nowhere near 2TB.
I'd appreciate any insights into understanding this constraint of maximum file size from the filesystem and how this could affect speed for indexing tables and running queries. This could really help me in my decision of which database to use for my analysis.
The SQLite database engine stores the entire database in a single file. This may not be very efficient for incredibly large files (SQLite's limit is 2TB, as you've found in the help). In addition, SQLite allows only one writer at a time, although multiple connections can read concurrently. If your application is web based or might end up being multi-threaded (like an AsyncTask on Android), mysql is probably the way to go.
Personally, since you've done tests and mysql is faster, I'd just go with mysql. It will be more scalable going into the future and will allow you to do more.
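To put the quoted 2 TB limit in perspective, a quick back-of-the-envelope calculation may help. The 200 bytes per row figure below is purely a guess for illustration; plug in your own average row size. The page size and the limit come from the passage quoted in the question.

```python
PAGE_SIZE = 1024        # bytes, the default page size in the quoted passage
LIMIT_BYTES = 2 ** 41   # the 2-terabyte limit from the quoted passage

max_pages = LIMIT_BYTES // PAGE_SIZE   # pages it takes to reach the limit
limit_tib = LIMIT_BYTES / 1024 ** 4    # the limit in binary terabytes

ROWS = 100_000_000      # upper end of the dataset described in the question
BYTES_PER_ROW = 200     # hypothetical average row size, including overhead
estimated_bytes = ROWS * BYTES_PER_ROW
fraction_of_limit = estimated_bytes / LIMIT_BYTES

print(f"limit: {limit_tib} TiB in {max_pages} pages")
print(f"estimate: {estimated_bytes / 1e9:.0f} GB, "
      f"{fraction_of_limit:.2%} of the limit")
```

Under that assumption the dataset is well under 1% of the limit, so the file-size ceiling itself is unlikely to be what decides between the two databases here.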
I'm not sure what this means. When I have experimented with mysql and sqlite, it seems that mysql is faster, but I haven't constructed very rigorous speed tests.
The short short version is:
If your app needs to fit on a phone or some other embedded system, use SQLite. That's what it was designed for.
If your app might ever need more than one connection writing at the same time, do not use SQLite. Use PostgreSQL, MySQL with InnoDB, etc.
It seems that (in R, at least) SQLite is awesome for ad hoc analysis. With the RSQLite or sqldf packages it is really easy to load data and get started. But for data that you'll use over and over again, it seems to me that MySQL (or SQL Server) is the way to go because it offers a lot more features in terms of modifying your database (e.g., adding or changing keys).
A SQL server (such as MySQL), if you are mainly using this as a web service.
SQLite, if you want it to be able to function offline.
In my experience, SQLite is generally much faster, as the majority (or all) of the data and indexes will be cached in memory. When the data is split across multiple tables, or even multiple SQLite database files, it has so far been far more effective than a SQL server even for millions of records (I have yet to reach hundreds of millions), because it avoids network latency and similar overhead. However, that holds only when the records are split across different tables and queries are specific to those tables (don't query all tables).
An example would be an item database used in a simple game. While this may not sound like much, a UID was issued even for variations, so the generator quickly worked out to more than a million sets of 'stats' with variations. The performance was mainly due to each 1000 sets of records being split among different tables (as we mainly pull records via their UID). Although the effect of the splitting was not properly measured, we were getting queries that were easily 10 times faster than with a SQL server (mainly due to network latency).
Amusingly, though, we ended up reducing the database to a few thousand entries, having item [pre-fix] / [suf-fix] determine the variations (like Diablo, only hidden), which proved to be much faster at the end of the day.
On a side note, my case was mainly due to the queries being lined up one after another (each waiting for the one before it). If, however, you are able to run multiple connections or queries against the server at the same time, the performance drop with a SQL server is more than compensated for on the client side, assuming these queries do not branch or depend on one another (e.g. if this query returns a result, run this one, else that one).
I want to import data from a MySQL server into an Oracle database, and I found a suggestion to use an Oracle database link. The Oracle instance is 10.0.2.1, and the MySQL server instance should be 5.1. The connection between the two servers and the hard disk should not be the bottleneck.
I want to ask about the performance of an Oracle database link. How fast is it? Is it very slow, slow, or fast? Is it capable of transferring 1000 rows/second?
Thank you
1000 rows/sec is definitely achievable... the question is whether it's achievable on your database/network infrastructure.
Even if we had a detailed knowledge of your infrastructure it would still be very hard to say... it depends on so many factors like network speed, network latency, the size of the database rows being transferred etc.
The only way to tell for sure is to test it.
I would look on this as a good thing - the process of building the test is bound to teach you a lot about how it could work... it will throw up a number of issues that you're going to have to consider at some point - how do you handle backlogs when they form? What is the max throughput you can achieve? etc. You'll learn what kind of data transfer works best for you (e.g. single rows at a time or larger batches). You might also want to try it with a mechanism other than SQL (e.g. queues).
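A test like that does not need to be elaborate. Below is a rough sketch of a throughput measurement that lets you compare batch sizes. The names batches and measure_throughput are invented for the example, and write_batch is a placeholder for whatever actually pushes rows across the link. Here it just appends to a list, so the printed rate says nothing about a real database link.

```python
import time
from itertools import islice


def batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def measure_throughput(rows, write_batch, batch_size):
    """Push rows through write_batch in batches.

    Returns (rows_sent, rows_per_second)."""
    sent = 0
    start = time.perf_counter()
    for batch in batches(rows, batch_size):
        write_batch(batch)  # placeholder for the real transfer step
        sent += len(batch)
    elapsed = time.perf_counter() - start
    return sent, (sent / elapsed if elapsed > 0 else float("inf"))


received = []  # stand-in destination; replace with a real insert call
sent, rate = measure_throughput(range(2500), received.append, batch_size=1000)
print(f"{sent} rows at {rate:.0f} rows/s in {len(received)} batches")
```

Running it with batch_size=1 and then with a few larger values against the real link should tell you quickly whether 1000 rows/second is realistic and where the time goes.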
You say that you don't think the network / hard disk access will be an issue - again, you need to test this assumption. Every database has a limiting factor on the performance somewhere (or they'd be infinitely fast!) and it's quite often disk access that is the limiting factor. In this case I would speculate that the network may be the limiting factor, but there's no way to know for sure without measuring it.
Generally speaking, dblink performance is limited by network speed, but there are some pitfalls that lead to performance issues:
unnecessary joins between local and remote tables that leads to transferring large amounts of data;
lack of parallelism built into the query (unions help in this case);
implicit sorting on remote database side;
failure to comply with Oracle recommendations such as the use of collocated views and hints (mainly DRIVING_SITE and NO_MERGE).