Can sqlite3 handle 30 concurrent update requests gracefully?

We like the simplicity of sqlite3 but are concerned about its ability to handle concurrent updates gracefully. Our web app is for about 30 users (50 users maximum) who have rights to update, and a number of web users (let's say 500) who can only read the page. Those 30 (50) users will likely not update simultaneously. Daily updates to the db should be no more than 1000 (counting the saving of one db record into a table as ONE update) on a regular basis. The update activity most likely happens during the 9am-5pm work hours.
Since sqlite3 locks the whole db for an update (not sure whether it also locks for read requests), our question is: is sqlite3 powerful enough to handle the concurrent updates gracefully in our situation without throwing exception errors?
Thanks so much.

I think you already have enough information about how SQLite works, so the answer to your question is "yes", it can handle it. But the real question is: what will the performance be? That depends on the frequency of updates/inserts to your database. Updates will lock the database and keep reads waiting.
Let's say the performance is acceptable and you use it. What if your database gets corrupted? Even the most advanced DBMS can end up with corrupted data; there can be many reasons for this, from a server shutdown to bugs. If your SQLite file gets corrupted, as far as I know it is harder to recover the database.
I'd strongly suggest not taking the risk and using a non-embedded DBMS instead.
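That said, if you do decide to try SQLite for this load, here is a minimal sketch (using Python's built-in sqlite3 module purely for illustration; the records table and column names are invented) of the two settings that matter most for handling concurrent writers gracefully: a busy timeout, so a writer waits for the lock instead of immediately raising "database is locked", and WAL mode, so readers are not blocked while a write is in progress.

```python
import sqlite3

def get_connection(path="app.db"):
    # timeout: wait up to 30 s for a competing writer to release the lock
    conn = sqlite3.connect(path, timeout=30)
    # WAL mode lets readers keep reading while one writer writes
    conn.execute("PRAGMA journal_mode=WAL")
    return conn

def save_record(conn, user_id, payload):
    # One short transaction per update; the write lock is held only briefly.
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO records (user_id, payload) VALUES (?, ?)",
            (user_id, payload),
        )
```

With roughly 1000 short updates spread over a working day, writers should queue for milliseconds at a time rather than colliding.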

Related

What is the difference between MYSQL and SQLite multi-user functionality?

I am new to server side programming and am trying to understand relational databases a little better. Whenever I read about MYSQL vs SQLite people always talk about SQLite not being able to have multiple users. However, when I program with the Django Framework I am able to create multiple users on the sqlitedb. Can someone explain what people mean by multi-user? Thanks!
When people talk about multiple users in this context, they are talking about simultaneous connections to the database. The users in this case are threads in the web server that are accessing the database.
Different databases have different solutions for handling multiple connections working with the database at once. Generally reading is not a problem, as multiple reading operations can overlap without disturbing each other, but only one connection can write data to a specific unit at a time.
The difference in concurrency between databases is basically how large a unit they lock when someone is writing. MySQL has an advanced system where records, blocks or tables can be locked depending on the need, while SQLite has a simpler system where it only locks the entire database.
The impact of this difference is seen when you have multiple threads in the webserver, where some threads want to read data and others want to write data. MySQL can read from one table and write into another at the same time without problem. SQLite has to suspend all incoming read requests whenever someone wants to write something, wait for all current reads to finish, do the write, and then open up for reading operations again.
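To make that concrete, here is a tiny illustration (Python's sqlite3 module, throwaway file name) of the whole-database write lock: while one connection holds a write transaction, a second connection cannot even begin its own write.

```python
import sqlite3

# isolation_level=None puts the module in autocommit mode, so the explicit
# BEGIN/COMMIT below are sent to SQLite exactly as written.
a = sqlite3.connect("demo.db", timeout=0, isolation_level=None)
b = sqlite3.connect("demo.db", timeout=0, isolation_level=None)
a.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")

a.execute("BEGIN IMMEDIATE")          # connection A takes the write lock
a.execute("INSERT INTO t VALUES (1)")

try:
    b.execute("BEGIN IMMEDIATE")      # connection B also wants to write
except sqlite3.OperationalError as e:
    print(e)                          # -> "database is locked"

a.execute("COMMIT")                   # lock released; B can now write
```

In practice you would set a non-zero timeout so the second writer waits instead of failing immediately, but the lock it waits for is still database-wide.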
As you can read here, SQLite supports multiple users, but it locks the whole db for writes.
SQLite is usually used for development, but MySQL is a better choice for production because it has better support for concurrent access and writes, which SQLite doesn't.
Hope that helps.
SQLite concurrency is explained in detail here.
In a nutshell, SQLite doesn't have the fine-grained concurrency mechanisms that MySQL does. When someone tries to write to a MySQL database, the MySQL database will only lock what it needs to lock, usually a single record, sometimes a table.
When a user writes to a SQLite database, the entire database file is momentarily locked. As you might imagine, this limits SQLite's ability to handle many concurrent users.
Multi-user means that many tasks (possibly on many separate computers) can have open connections to the database at the same time.
A multi-user database provides things like locks to allow these tasks to update the database safely.
Look at ScimoreDB. It's an embedded database that supports multi-process (or user) read and write access. It also can work as a client-server database.

Is it a good idea to wrap a data migration into a single transaction scope?

I'm doing a data migration at the moment of a subset of data from one database into another.
I'm writing a .NET application that is going to communicate with our in-house ORM, which will drag data from the source database to the target database.
I was wondering: is it feasible, or even a good idea, to put the entire process into a transaction scope and commit it only if there are no problems?
I'd say I'd be moving possibly about 1 GB of data across.
Performance is not a problem but is there a limit on how much modified or new data that can be inside a transaction scope?
There's no limit other than the physical size of the log file (note that the size required will be much more than the size of the migrated data). Also consider that if there is an error and you roll back the transaction, the rollback may take a very, very long time.
If the original database is relatively small (< 10 gigs) then I would just make a backup and run the migration non-logged without a transaction.
If there are any issues just restore from back-up.
(I am assuming that you can take the database offline for this - doing migrations when live is a whole other ball of wax...)
If you need to do it while live then doing it in small batches within a transaction is the only way to go.
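For illustration, a rough sketch of that batched approach (Python with pyodbc and T-SQL-flavoured SQL purely as an example; connection strings, table and column names are placeholders, and in practice this would go through your in-house ORM): copy a slice at a time, commit each slice, and keep a checkpoint so the job can resume if it is interrupted.

```python
import pyodbc

BATCH = 10_000
SRC_CONN_STRING = "..."   # source database connection string (placeholder)
DST_CONN_STRING = "..."   # target database connection string (placeholder)

src = pyodbc.connect(SRC_CONN_STRING)
dst = pyodbc.connect(DST_CONN_STRING)

last_id = 0                                    # checkpoint: highest id copied so far
while True:
    rows = src.cursor().execute(
        f"SELECT TOP {BATCH} id, col1, col2 "
        "FROM source_table WHERE id > ? ORDER BY id",
        last_id,
    ).fetchall()
    if not rows:
        break

    cur = dst.cursor()
    cur.executemany(
        "INSERT INTO target_table (id, col1, col2) VALUES (?, ?, ?)",
        [tuple(r) for r in rows],
    )
    dst.commit()                               # one short transaction per batch
    last_id = rows[-1].id                      # persist somewhere durable to allow resuming
```

Each batch holds its locks only briefly, so other users of the live server are not locked out for the duration of the whole migration.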
I assume you are copying data between different servers.
In answer to your question, there is no limit as such. However, there are limiting factors which will affect whether this is a good idea. The primary one is locking and lock contention. I.e.:
If the server is in use for other queries, your long-running transaction will probably lock other users out.
Whereas, if the server is not in use, you don't need a transaction.
Other suggestions:
Consider writing the code so that it is incremental and interruptible, i.e. it does a bit at a time and will carry on from wherever it left off. This will involve lots of small transactions.
Consider loading the data into a temporary or staging table within the target database, then use a transaction when updating from that source, using a stored procedure or SQL batch. You should not have too much trouble putting that into a transaction because, being on the same server, it should be much, much quicker.
Also consider SSIS as an option. Actually, I know nothing about SSIS, but it is supposed to be good at this kind of stuff.
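And a sketch of the staging-table suggestion above (again pyodbc with placeholder connection string, table and column names; the final step would more likely live in a stored procedure): bulk-load outside a transaction, then apply staging to target in one short, server-local transaction.

```python
import pyodbc

TARGET_CONN_STRING = "..."            # target server connection string (placeholder)
rows_from_source = [(1, "a", "b")]    # in practice produced by your extraction step

dst = pyodbc.connect(TARGET_CONN_STRING, autocommit=True)
cur = dst.cursor()

# 1. Bulk-load into the staging table with no surrounding transaction.
cur.executemany(
    "INSERT INTO staging_orders (id, col1, col2) VALUES (?, ?, ?)",
    rows_from_source,
)

# 2. Apply staging -> target inside a single transaction; both tables live on
#    the same server, so this step is quick compared to pulling data over the wire.
dst.autocommit = False
try:
    cur.execute(
        "INSERT INTO orders (id, col1, col2) "
        "SELECT s.id, s.col1, s.col2 FROM staging_orders s "
        "WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.id = s.id)"
    )
    dst.commit()
except Exception:
    dst.rollback()
    raise
```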

How many databases can MySQL handle?

My MySql server currently has 235 databases. Should I worry?
They all have same structure with MyISAM tables.
The hardware is a virtual machine with 2 GB RAM running on a Quad-Core AMD Opteron 2.2GHz.
Recently cPanel sent me an email saying that MySql has failed and a restart has been made.
New databases are expected to be created, and I wonder whether I should add more memory or simply add another virtual machine.
The "databases" in mysql are really catalogues, is has no effect on its limits whether you put all the tables in one or each in its own.
The main problem is the table cache. Without tuning it, you're going to have the default table cache (=64 typically), which means you will be closing a table every time you open one. This is incredibly bad.
With MyISAM it's even worse, because closing a table throws its key blocks out of the key cache, which means subsequent index lookups or scans will be reading actual blocks from disc, which is horrible and slow and really needs to be avoided.
My advice is:
If possible, immediately increase the table cache to > the total number of tables
Monitor the global status variable Opened_tables; if it increases rapidly, this is bad.
Carry out performance and robustness testing on the same hardware in a non-production environment (if you are not doing so already).
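For the monitoring part, a quick sketch of what to look at (using the MySQLdb driver; connection details are placeholders). The variable is called table_open_cache on newer servers and table_cache on older ones.

```python
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="monitor", passwd="...", db="mysql")
cur = conn.cursor()

# Current size of the table cache (table_open_cache, or table_cache on old servers).
cur.execute("SHOW VARIABLES LIKE 'table%cache'")
print(cur.fetchall())

# Sample this periodically: a rapidly growing Opened_tables means tables are
# constantly being closed and reopened, which is exactly the bad case above.
cur.execute("SHOW GLOBAL STATUS LIKE 'Opened_tables'")
print(cur.fetchall())

# To raise the cache (needs the SUPER privilege; also set it in my.cnf so it persists):
# cur.execute("SET GLOBAL table_open_cache = 4096")
```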
(reposting my comment for better visibility)
Thank you all for your comments. The system is something similar to Google Analytics. Visits to users' websites are logged into a "master" table. A native application monitors the master table, processes the registered visits and writes them to the users' databases. Each user has their own DB; this was decided for sharding. Various reports and statistics are run for each user, and it is faster if they run only on that specific DB (less data). I know this is not the best setup, but we have to deal with it for a while.
I don't believe there is a hard limit; the only thing really limiting you will be your hardware and the traffic these databases will be getting.
You seem to have very little memory, which probably means you don't have massive numbers of connections...
You should start by profiling usage for each database (or set of databases, depending on how they are used of course).
My suggestion - MySQL (or any database server for that matter) could use more memory. You can never have enough.
You are doing it wrong.
Comment with some specifics about your databases, and we can probably fill you in on where your design went wrong.

MySQL synchronization questions

I have a MySQL DB which manages users’ accounts data.
Each user can only query his own data.
I have a script that on initial login gets the user data and inserts it to the DB.
I scheduled a cron process which updates all users’ data every 4 hours.
Here are my questions regarding it:
(1) - Do I need to implement some kind of lock mechanism on the initial login script?
This script can be executed by a large number of users simultaneously - but every user has a dedicated place in the DB, so it does not affect other DB rows.
(2) - Same question for the cron process; should I handle this scenario:
While the cron process updates user i's data, user i tries to fetch his data from the DB.
I mean, does MySQL already support and handle this scenario?
Any help would be appreciated.
Thanks.
No, you don't need to lock the database; the MySQL engine handles this task for you. If you were writing your own database engine, you would have to make sure that nothing gets in the way of or conflicts with a data update, but since you are running something as smart as MySQL, you don't need to worry about it.
While data is being updated, other queries will wait in line until the update finishes.
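To put it in code terms, here is a minimal sketch (MySQLdb driver, invented table and column names) of what both the initial-login script and the cron job can safely do without any application-level locks: each just issues its own statements and MySQL serializes conflicting access to the row.

```python
import MySQLdb

def upsert_user_data(conn, user_id, data):
    """Called from the initial-login script and from the 4-hourly cron alike."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO user_data (user_id, data) VALUES (%s, %s) "
        "ON DUPLICATE KEY UPDATE data = VALUES(data)",
        (user_id, data),
    )
    conn.commit()

def fetch_user_data(conn, user_id):
    """Called when user i reads his data; may overlap with the cron's update."""
    cur = conn.cursor()
    cur.execute("SELECT data FROM user_data WHERE user_id = %s", (user_id,))
    return cur.fetchone()
```

With InnoDB the read simply sees the last committed value; with MyISAM it briefly waits for the write to finish. Either way, MySQL does the locking for you.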

MySQL tracking system

I have to implement a tracking system backed by a MySQL database. The system will track many apps, with at least 5 events tracked for each app (e.g. how many users clicked on link x, how many users visited page y). Some apps will have millions of users, so a few thousand updates per second is not a far-fetched assumption.
Another component of the system will have to compute some statistical info that should be updated every minute. The system should also record past values of those statistics.
The approach a friend of mine suggested was to log every event in a log table and have a cron job that runs every minute and computes the desired info and updates a stats table.
This sounds reasonable to me. Are there better alternatives?
Thanks.
I've logged to a MySQL log table with a cron that crunches it.
I generally use InnoDB tables in my apps, but I made the log table MyISAM and used INSERT DELAYED ... queries.
MyISAM doesn't provide all the goodies of InnoDB, but I believe it is slightly faster (for that reason).
The main thing you are worried about is database locking when your cron is running, but using INSERT DELAYED gets around that problem for the most part.
If your hit rate is too high even for INSERT DELAYED into a MyISAM table to handle, you may want to keep recent hits in memory (memcached can come in handy, or a custom daemon you could write) and periodically process the hits from memory into the aggregated database stats table.
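Roughly, the moving parts look like this (a sketch using the MySQLdb driver with invented table names; event_log stands for the MyISAM log table and event_stats for the aggregated table the cron writes to):

```python
import MySQLdb

def log_event(conn, app_id, event):
    """Called on every hit; INSERT DELAYED returns quickly and lets MyISAM
    batch the writes (DELAYED only works on MyISAM/MEMORY/ARCHIVE tables)."""
    cur = conn.cursor()
    cur.execute(
        "INSERT DELAYED INTO event_log (app_id, event, created_at) "
        "VALUES (%s, %s, NOW())",
        (app_id, event),
    )

def crunch_stats(conn):
    """Run from cron every minute: fold new log rows into the stats table,
    then clear them, using a high-water mark so nothing is lost or double-counted."""
    cur = conn.cursor()
    cur.execute("SELECT MAX(id) FROM event_log")
    (high_water,) = cur.fetchone()
    if high_water is None:
        return
    cur.execute(
        "INSERT INTO event_stats (app_id, event, hits, computed_at) "
        "SELECT app_id, event, COUNT(*), NOW() "
        "FROM event_log WHERE id <= %s GROUP BY app_id, event",
        (high_water,),
    )
    cur.execute("DELETE FROM event_log WHERE id <= %s", (high_water,))
    conn.commit()
```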
I would really recommend using an existing log analyzer on the logs your web server already produces; one example is webalizer. Even better, in my opinion, is an external system such as Google Analytics. That works better since it keeps working even with intermediate systems such as load balancers and caches in place.