Our company has a database of 17,000 entries. We have used MS Access for over 10 years for our various mailings. Is there something new and better out there? I'm not a techie, so keep in mind when answering. Our problems with Access are:
-no record of what was deleted,
-will not turn up a name in a search if cap's or punctuation
is not entered exactly,
-is complicated for us to understand the de-duping process.
- We'd like a more nimble program that we can access from more than one dedicated computer.
The only applications I know of that are comparable to Access are FileMaker Pro, and the database component of the Open Office suite. FM Pro is a full-fledged product and gets good marks for ease of use from non-technical users, while Base is much less robust and is not nearly as easy for creating an application.
All of the answers recommending different databases really completely miss the point here -- the original question is about a data store and application builder, not just the data store.
To the specific problems:
PROBLEM 1: no record of what was deleted
This is a design error, not a flaw in Access. There is no database that really keeps a record of what's deleted unless someone programs logging of deleted data.
But backing up a bit, if you are asking this question it suggest that you've got people deleting things that shouldn't be deleted. There are two solutions:
regular backups. That would mean you could restore data from the last backup and likely recover most of the lost data. You need regular backups with any database, so this is not really something that is specific to Access.
design your database so records are never deleted, just marked deleted and then hidden in data entry forms and reports, etc. This is much more complicated, but is very often the preferred solution, as it preserves all the data.
Problem #2: will not turn up a name in a search if cap's or punctuation is not entered exactly
There are two parts to this, one of which is understandable, and the other of which makes no sense.
punctuation -- databases are stupid. They can't tell that Mr. and Mister are the same thing, for instance. The solution to this is that for all data that needs to be entered in a regularized fashion, you use all possible methods to insure that the user can only enter valid choices. The most common control for this is a dropdown list (i.e., "combo box"), which limits the choices the user has to the ones offered in the list. It insures that all the data in the field conforms to a finite set of choices. There are other ways of maintaining data regularity and one of those involves normalization. That process avoids the issue of repeatedly storing, say, a company name in multiple records -- instead you'd store the companies in a different table and just link your records to a single company record (usually done with a combo box, too). There are other controls that can be used to help insure regularity of data entry, but that's the easiest.
capitalization -- this one makes no sense to me, as Access/Jet/ACE is completely case-insensitive. You'll have to explain more if you're looking for a solution to whatever problem you're encountering, as I can't conceive of a situation where you'd actually not find data because of differences in capitalization.
Problem #3: is complicated for us to understand the de-duping process
De-duping is a complicated process, because it's almost impossible for the computer to figure out which record among the candidates is the best one to keep. So, you want to make sure your database is designed so that it is impossible to accidentally introduce duplicate records. Indexing can help with this in certain kinds of situations, but when mailing lists are involved, you're dealing with people data which is almost impossible to model in a way where you have a unique natural key that will allow you to eliminate duplicates (this, too, is a very complicated topic).
So, you basically have to have a data entry process that checks the new record against the existing data and informs the user if there's a duplicate (or near match). I do this all the time in my apps where the users enter people -- I use an unbound form where they type in the information that is the bare minimum to create a new record (usually some combination of lastname, firstname, company and email), and then I present a list of possible matches. I do strict and loose matching and rank by closeness of the match, with the closer matches at the top of the list.
Then the user has to decide if there's a match, and is offered the opportunity to create the duplicate anyway (it's possible to have two people with the same name at the same company, of course), or instead to abandon adding the new record and instead go to one the existing records that was presented as a possible duplicate.
This leaves it up to the user to read what's onscreen and make the decision about what is and isn't a duplicate. But it maximizes the possibility of the user knowing about the dupes and never accidentally creating a duplicate record.
Problem #4: We'd like a more nimble program that we can access from more than one dedicated computer.
This one confuses me. Access is multi-user out of the box (and has been from the very beginning, nearly 20 years ago). There is no limitation whatsoever to a single computer. There are things you have to do to make it work, such as splitting your database into to parts, one part with just the data tables, and the other part with your forms and reports and such (and links to the data tables in the other file). Then you keep the back end data file on one of the computers that acts as a server, and give a copy of the front end (the reports, forms, etc.) to each user. This works very well, actually, and can easily support a couple of dozen users (or more, depending on what they are doing and how well your database is designed).
Basically, after all of this, I would tend to second #mwolfe02's answer, and agree with him that what you need is not a new database, but a database consultant who can design for you an application that will help you manage your mailing lists (and other needs) without you needing to get too deep into the weeds learning Access (or FileMaker or whatever). While it might seem more expensive up front, the end result should be a big productivity boost for all your users, as well as an application that will produce better output (because the data is cleaner and maintained better because of the improved data entry systems).
So, basically, you either need to spend money upfront on somebody with technical expertise who would design something that allows you to do better work (and more efficiently), or you need to invest time in upping your own technical skills. None of the alternatives to Access are going to resolve any of the issues you've raised without significant investment in interface design to further the goals you have (cleaner data, easier to find information, etc.).
At the risk of sounding snide, what you are really looking for is a consultant.
In the hands of a capable programmer, all of your issues with Access are easily handled. The problems you are having are not the result of using the wrong tool, but using that tool less than optimally.
Actually, if you are not a techie then Access is already the best tool for you. You will not find a more non-techie friendly way to build a data application from bottom to top.
That said, I'd say you have three options at this point:
Hire a competent database consultant to improve your application
Find commercial off-the-shelf (COTS) software that does what you need (I'm sure there are plenty of products to handle mailings; you'll need to research)
Learn about database normalization and building proper MS Access applications
If you can find a good program that does what you want then #2 above will maximize your Return on Investment (ROI). One caveat is that you'll need to convert all of your existing data, which may not be easy or even possible. Make sure you investigate that before you buy anything.
While it may be the most expensive option up-front, hiring a competent database consultant is probably your best option if you need a truly custom solution.
SQL Server sounds like a viable alternative to your scenario. If cost is a concern, you can always use SQL Server Express, which is free. Full blown SQL Server provides a lot more functionality that might not be needed right away. Express is a lot simpler as the number of features provided with it are much smaller. With either version though you will have centralized store for your data and the ability to allow all transactions to be recorded in the transaction log. Also, both have the ability to import data from an Access Database.
The newest version of SQL Server is 2008 R2
You probably want to take a look at modern databases. If you're into Microsoft-based products, start with SQL Server Express
EDIT: However, since I understand that you're not a programmer yourself, you'd probably be better off having someone experienced look into your technical problem more deeply, like the other answer suggests.
It sounds like you may want to consider a front-end for your existing Access data store. Microsoft has yet to replace Access per se, but they do have a new tool that is a lot lower on the programming totem pole than some other options. Check out Visual Studio Lightswitch - http://www.microsoft.com/visualstudio/en-us/lightswitch.
It's fairly new (still in beta) but showing potential. With it, just as with any Visual Studio project, you can connect to an MS Access datasource and design a front-end to interact with it. The plus-side here is programming requirements are much lower than with straight-up Visual Studio (read: Wizards).
Given that replacing your Access DB will require some font-end programming, you may look into VistaDB. It should allow your front end to be created in .NET with an XCopy database on the backend without requiring a server. One plus is that it retains SQL Server syntax, so if you do decide to move to SQL Server you'll be one step ahead.
(Since you're not a techie and may not understand my previous statement, you might pass my answer on to the consultant/programmer/database guy who is going to do the work for you.)
http://www.vistadb.net/
Related
I'm building up an online (paid) service used for business administration purposes. The database is structured like so:
I have a contacts table filled with persons, contact info and the like. Then I have a few other tables holding information about payments, agreements and appointments. Also statistics like how much money was transferred this month, how many hours worth of appointments this month and the like.
I'm using MySQL (but could also go for MSSQL or some other service if necessary) and I had no formal training in any programming language whatsoever (yet).
I'm building a WPF application for acces to this database. Also planning on building an app so users can access their data and plan new appointments and register payments on the go.
I'm going to go for a login system to verify their right to login and use my service.
My question is about how to structure this. I'm not an SQL expert nor have I had any formal training in SQL or any other programming language. What I do know though is that my client-side app is almost out of the alpha stage.
So far I have come up with two ways to structure this.
1. Users get a seperate database.
My original idea was to give each user a seperate database, this makes it easier to provide people with statistics. Also it makes it easier to spread the workload through multiple, seperate servers. People would login to a master/main server, where their login information is stored, fetch their server info and programatically be 'redirected' to their own database. Spreading these databases also make it easier to provide individual back-ups to users.
The down-side of this is the sheer quantity of databases I'd have to manage. I'm planning on ending up with hundreds of thousands of users. Let's just say I want the system to be able to provide to an infinite amount of users.
2. Everything is stored in one database.
It's also possible to store everything in one database. This would make the database structure somewhat more complicated (while it also makes the whole a lot simpler). I'd have to add 'AND consumer_ID='" + MyID + "' to every query. (Which ofcourse is possible) and add a few tables to handle statistics per user.
It would be simpler to provide every user with the same database updates. Maintenance would be easier.
The down-side of this is that it makes it harder to spread the workload to seperate servers, I'd have to build something to make it possible that seperate servers mirror each other. Also I'd have to make sure that the workload is automatically divided between the servers, instead of simply going for: Fill server with X databases, then new server, fill, new etc.
I'm not in the luxury of hiring someone with any SQL training.
The most important thing for me now is that the system can be easily maintained while still being safe and reliable. I'm an amateur developer, going to college next year. I don't want to spend 50% of my time maintaining the database.
I think I got the major part of the details you might need, if you need anymore please ask for them.
I thank you in advance :)
Just go with solution 2. The downside of spreading the workload to many servers is fullfilled by "partitioning", look here for a starting point: http://dev.mysql.com/doc/refman/5.1/en/partitioning-overview.html
Partitioning would allow you for example to put all information of a table containing even IDs for consumers on the one, all other on the second server. Or whatever you want...
But i wouldn't start that complicated: do you need that now? It burdens you (either way) with such a big additional overhead! You can also look into the NoSQL database world for solutions that can be spread to as many servers as you want with low effort. You loose SQL and it's ACID features in the most cases; if you need those NoSQL is not an option.
I am required to make a general schema of a huge database that I have never used.
The problem is that I do not know how/where could I start doing this because, not considering the size, I have no idea of what is each table for. I can guess some but there are the mayority of them in which generic name fields do not say anything to me.
Do you have some advice?what could I do?
There is no documentation about the database and the creators are not able to help me because they are in another company now.
Thank you very much in advanced.
This isn't going to be easy.
Start by gathering any documentation, notes, etc. that exist. Also, it'll greatly help to have a thorough understanding of the type of data being stored, and of the application. Keep ample notes of your discoveries, and build the documentation that should have been built before.
If your database contains declared foreign keys, you can start there, and at least get down the relationships between the tables. Keeping in mind that this may be incomplete. As #John Watson points out, if the relationships are declared, there are tools to do this for you.
Check for stored functions and procedures, including triggers. Though these are somewhat uncommon in MySQL databases. Triggers especially will often yield clues ("every update to table X inserts a new row to table Y" -> "table Y is probably a log or audit table").
Some of the tables are hopefully obvious, and if you know what is related to them, you may be able to start figuring out those related tables.
Hopefully you have access to application code, which you can grep and read to find clues. Access to a test environment which you can destroy repeatedly will be useful too ("what happens if I change this in the app, where does the database change?"; "what happens if I scramble these values?"; etc.). You can dump tables and use diff on them, provided you dump them ordered by primary or unique key.
Doing queries like SELECT DISTINCT foo FROM table can help you see what different things can be in a column.
If its possible to start from a mostly-empty database (e.g., minimal to get the app to run), you can observe what changes as you add data to the app. Much quicker to dump the database when its small. Same for diffing it, same for reading through the output. Some things are easier to understand in a tiny database, but some things are more difficult. When you have a huge dataset and a column is always 3, you can be much more confident it always is.
You can watch SQL traffic from the application(s) to get an idea of what tables and columns they access for each function, and how they join them. Watching SQL traffic can be done in application-specific ways (e.g., DBI trace) or server-specific ways (turn on the general query log) or with a packet tracer like Wireshark or tcpdump. Which is appropriate is going to depend on the environment you're working in. E.g., if you have to do this on a production system, you probably want Wireshark. If you are doing this in dev/test, the disadvantage of the MySQL query log is that all the apps may very well be mixed together, and if multiple people are hitting the apps it'll get confusing. The app-specific log probably won't suffer from this, but of course the app may not have that.
Keep in mind the various ways data can be stored. For example, all three of these could mean May 1, 1980:
1980-05-01 — As a DATE, TIMESTAMP, or text.
2444330.5 — Julian day (with time, specifies at midnight)
44360 — Modified Julian day
326001600 — UNIX timestamp (with time, specifies midnight) assuming local time is US Eastern Time (seconds since Jan 1 1970 UTC)
There may be things in the database which are denormalized, and some of them may be denormalized incorrectly. E.g., you may be wondering "why does this user have a first name Bob in one table, and a first name Joe in another?" and the answer is "data corruption".
There may be columns that aren't used. There may be entire tables that aren't used. Despite this, they may still have data from older versions of the app (or other, no-longer-in-use apps), queries run from the MySQL console, etc.
There may be things which aren't visible in the application anywhere, but are used. Their purpose may be completely non-obvious without knowledge of the algorithms implemented in the app(s). For example, a search function in an app may store all kinds of precomputed information about the documents to search and their connections. Worse, these tables may only be updated by batch jobs, so changing a document won't touch them (making you mistakenly believe they have nothing to do with documents). Then, you come in the next morning, and the table is mysteriously very different. Though, in the search case, a query log when running search will tell you.
Try using the free mySQL workbench (it's specific to mySQL).
I have reverse engineered databases this way and also ended up with great Entity Relationship Diagrams!
I've worked with SQL for 20 years and this product really is great (it's free, from the mysql folks themselves).
It can have occasional problems, crashes, etc. at least it did on Ubuntu10 but they've been relatively rare and far out-weighed by the benefits! It's also actively developed so bugs are actually fixed on an on-going basis.
Assuming that nobody bothered to declare foreign keys in the table definition, and the database belongs to an application which is in use, after grabbing the current schema, the next step for me would be to enable logging of all queries (hoping that the data does NOT use a trivial ORM like [x]hibernate) to identify joins and data semantics.
This perl script may be helpful.
how much work should we do in the database?
Ok I'm really confused as to exactly how much "work" should be done IN the database, and how much work had to be done instead at the application level?
I mean I'm not talking about obvious stuff like we should convert strings into SHA2 hashes at the application level instead of the database level..
But rather stuff that are more blur, including, but not limited to "should we retrieve the data for 4 column and do a uppercase/concatenation at the application level, or should we do those stuff at the database level and send the calculated result to the application level?
And if you could list any more other examples it would be great.
It really depends on what you need.
I like to do my business logic in the database, other people are religously against that.
You can use triggers and stored procedures/functions in SQL.
Links for MySQL:
http://dev.mysql.com/doc/refman/5.5/en/triggers.html
http://www.mysqltutorial.org/introduction-to-sql-stored-procedures.aspx
http://dev.mysql.com/doc/refman/5.5/en/stored-routines.html
My reasons for doing business logic in triggers and stored proces
Note that I'm not talking about bending the database structure towards the business logic, I'm talking about putting the business logic in triggers and stored procedures.
It centralizes your logic, the database is a central place, everything has to go through it. If you have multiple insert/update/delete points in your app (or you have multiple apps) you'll need to do the checks multiple times, if you do it in the database you only have to do the checks in one place.
It simplifies the application e.g., you can just add a member, the database will figure out if the member is already known and take the appopriate action.
It hides the internals of your database from the application, if you do all your logic in the application you will need intricate knowledge of your database in the application. If you use database code (triggers/procs) to hide that, you don't need to know every database detail in your app.
It makes it easier to restucture your database If you have the logic in your database, you can just change a tablelayout, replace the old table with a blackhole table, put a trigger on that and let the trigger do the updates to the new table, your app does not even need to know the database has changed, this allows legacy apps to keep working unchanged, whilst new apps can use the improved database layout.
Some things are easier in SQL
Some things work faster in SQL
I don't like to use (lots of and/or complicated) SQL code in my application, I like to put SQL code in a stored procedure/function and try to only put simple queries in my application code, that way I can just write code that explains what I mean in my application and let the database layer do the heavy lifting.
Some people disagree strongly with this, but this approach works well for me and has simplified debugging and maintenance of my applications a lot.
Generally, its a good practice to expect only "Data" from the Database. Its upto Application(s), to apply Business/Domain Logic and make sense of the data retrieved. Its highly recommended to do the following things in the Application Layer:
1) Formatting Date
2) Applying Math functions, such as interpolation/extrapolation, etc
3) Dynamic sorting (based on columns)
However, situations sometime warrant few things to be done at the database level.
In my opinion application should use data and database should provide them and that should be clear separation of concerns. So database gives records sorted, ordered and filtered according to requested conditions but it is up to application to apply some business logic to that records and "convert" them into something meaningful to the user.
For example, in my previous company we worked on big application for work time calculations. One of obvious functionalities in this kind of application is tracking vacation days of employees - how many days employee has per year, how many he used, how many left, etc. Basically we could write some triggers and procedures that would update those columns automatically. So when employee had his vacation days approved amount of days he applied for is taken from his "vacation pool" and added to "vacation days used". Pretty easy stuff but we decided to make it explicit on application level and boy, very soon we were happy we did it that way. Application had to be labor law compliant and it quickly turned out that not for all employees vacation days are calculated equally and sometimes vacation day can be not so vacation day at all but that is beside the point. Had we put this "easy" operation in database we had to version our database with every little change to a vacation days related logic and that would lead us straight to hell in customer support field due to a fact that it was possible to update only application without a need to update database (except clear "breakthrough" moments where database structure was changed of course).
In my experience I've found that many applications start with a straight-forward set of tables and then and handful of stored procedures to provide basic functionality. This works very well; it usually yields high performance and is simple to understand, it also mitigates any need for a complex middle-tier.
However, applications grow. It's not unusual to see large data-driven applications with thousands of stored procedures. Throw triggers into the mix and you have an application which, for anybody other than the original developers (if they're still working on it), is very difficult to maintain.
I will put a word in for applications which place most logic in the database - they can work well when you have some good database developers and/or you have a legacy schema which cannot be changed. The reason I say this is that ORMs take much of the pain out of this part of application development when you let them control the schema (if not, you often need to do a lot of fiddling to get it working).
If I was designing a new application then I would usually opt for a schema which is dictated by my application domain (the design of which will be in code). I would normally let an ORM handle the mapping between the objects and the database. I would treat stored procedures as exceptions to the rule when it came to data access (reporting can be much easier in sprocs than trying to coax an ORM into producing a complex output efficiently).
The most important thing to remember though, is that there are no "best practices" when it comes to design. It is up to you the developer to weigh up the pros and cons of each option in the context of your design.
I haven't used Access since high school, years ago.
What kind of problem does it solve well, or even better than a web app backed by a real RDBMS?
Is it still actively developed? Or is it pretty dead to MS already?
What are its biggest limitations?
Update:
What resource shall one use to learn how to develop a MS Access solution for small business?
Thanks
First and foremost, Access IS a real RDBMS. What it is isn't is a client server RDBMS.
The only implications of this are that there is a throttle on the number of simultaneous connections and the security of the data needs careful thought.
Amongst other things, Access is also an IDE that uses VBA as its language.
This means that in Access you can write Front End apps that link to either a SQL Server back end, an Access back end, or a SharePoint back end. So it is one very versatile cookie.
It's limitations are:
Security: if you are using an Access Back End, take note that it doesn't have the built in security of a client server database. In any app, security is a function of the cost and the requisite secrecy of the data.
Number of simultaneous connections. if you are not careful, Access will struggle with more than 10 people trying to update data simultaneously. You can extend that, but you need to know what you are doing to guarantee results. to put a number to it, lets say 50 simultaneous connections.
Like most databases, it is liable to corruption.
NOTE: when referring to Access as a database, you should really be referring to the "database engine", JET or ACE, depending on the version and, for Access 2007+, dictated by the file format that you use. In other words, if you are storing data in Access tables, you are using either JET or ACE. However, if you are using LINKED TABLES, that are in, for example, SQL Server, then you are not, strictly speaking, using JET or ACE security for those tables.
Access SQL doesn't allow you to write stored procedures (you can write functions in VBA), in the sense that Access SQL only allows imperative statements as opposed to procedural statements (eg, control flow statements). You can introduce some "procedural code" using VBA functions, but this is very different to using SQL statements.
You backup the file itself. You can write code to do this at the click of a button.
Security is always a function of cost. If you have data that is worth more than 100,000 US$ (either in loss of rights or legal liabilities if it is stolen and you have not shown due diligence in protecting it), then Access is probably not the answer. 100,000 is an arbitrary figure. The precise figure will depend on whether the data is insurable and the consequences of it being lost or stolen.
Ie, if the value of the data is the driving concern, then definitely don't use Access as a Back End. Whether you use it as a Front End, is a matter of budget. For US$5000 I have written apps that are still running 10 years later. They now need to port the back end to SQL Server because the volume of sensitive data has grown.
Access, when used within the above constraints AND when used by a professional Access developer (rather than some disgruntled fool who thinks he should be using "cooler" technologies), will produce very sophisticated, sturdy and reliable applications at a 10th of the cost of other systems. In such scenarios, Access is a total NO BRAINER.
Anything else will cost more, take longer and will only be as good as the person who writes the code and designs the UI.
I have an application (the first one I ever built in Access) that has run without problems for 10 years. We have extended it massively. I have moved into ASP.NET MVC, but Access is where I hail from and I have seen it work well.
So in summary: the number of users is relevant and the value or liabilities implicit in the data are the other deciding factor.
If the number of simultaneous users is low and the value/implicit liabilities of the data is low, then the choice is definitely Access.
However, get yourself a good developer.
EDITS/CLARIFICATIONS:
The above answer, like all answers, was written in haste in the middle of a working day. Some statements were a bit glib and generic and not written with a suitable degree of precision... However, when the comments made by others are reasonable, the author of the answer should edit the post and clarify.
1/
Access is a holy trinity. It is an IDE for writing forms and reports and functions to use in your queries. It "includes" a database engine (JET/ACE). It provides a Visual Interface onto the database engine that allows you to design queries, set up relationships between tables, etc.
It is usually referred in its many roles as just Access, but precision does help to learn Access and get the most out of it.
2/
Access can't use stored procedures in the sense that Access SQL can only use imperative statements rather than the procedural ones (eg, control flow statements). There is a reason, I have always thought, for calling them stored PROCEDURES.
3/
Not every Access app costs exactly 100,000. Nor is the budget of an Access app equal to the value of the data. That is obvious. The idea I was trying to convey was that if the data is worth more than a sum that can be reasonably insured, then don't use Access. Is that figure 100,000? According to Luke Chung and Clint Covington, ex program manager for Access, yes, but don't take their word for it. It really just means "a lot of money".
I have written an app for Medical Charities that still runs 10 years later after an initial budget of 5000. They have probably invested another 20,000 over the years. That kind of app is the Access sweet spot.
It all depends really, I will give you a quick example that happened to me recently. At work they needed a small system to capture some records from a group of about 15 users and pass about 15% of those records to another team of about 5 or so to do additional tasks on those records. This was a one off project that was going to last about 4 months.
The official IT solution was of course a web app with a SQL server backend coming in at about £60,000. As they had no SQL server space available and the budget was very small I decided to go with an unbound access database using JET to store the data.
In this example access/JET was the right choice, now if this had been a long term system to support 500 users of course the web app would be the way to go. Its horses for courses at the end of the day and people should not let their prejudices effect their business decisions.
Ah. Never. Point. Too many limitations in general. Backups are problematic, stability CAN be problematic. Especially if you compare access (file share daabase) against web ap you are in for a world of pain pretty much in every scenario.
Access is usable for small single place db stuff (loading data before moving it off to a SQL Server) or a front end for SQL Server (i.e. access not actually storing any data). The later is also pretty much the direction MS is taking access to - a front end technology.
My knowledge is quite old now, but it always used to be very good for reports - very quick, powerful, and much easier than, e.g. Crystal Reports.
If you just want to hack something out quickly, it's probably a bit easier to do at least some kinds of applications with Access than a web front end with a SQL (or whatever) backend. It is still being developed (Access 2010 was release within the last month or two, if memory serves).
I haven't used the new version to say for sure, but the last time I did any looking, it seemed like new editions were mostly updating the look to go along with the latest version of office, cleaning up semi-obvious problems and bugs, but not a whole lot more than that. I wouldn't say it's dead, but I don't see much to indicate that it's really one of Microsoft's top priorities either.
Trying to pin down it's biggest limitations is hard. The "JET Red" storage engine it's based around doesn't scale well at all -- but it was never really intended to. Its basic design is intended to conflate the application with the data being stored, so it's relatively difficult to just treat it as raw data to be used for other purposes. I don't know if it's still the case, but at least at one time, the database format was also fairly fragile -- file corruption was semi-common, and in most cases about the only hope of recovery was a backup file (which meant, at best, losing everything that had happened since the last backup -- and some forms of corruption weren't immediately obvious, so corrupt backups sometimes happened as well).
It comes down to this: if one of the Wizards built into access can produce exactly what you want, or at least something really close, and you only ever need to support a few users with the result, it might be a reasonable choice in a few situations. If that doesn't (all) apply, there are almost certain to be better alternatives.
The Jet database engine used by Access is considered deprecated by Microsoft, though it is still supported. The limits of an .mdb database and the newer .accdb type are described here.
Link
Even SQL Server Express would be better in almost every case.
Someone with very limited knowledge of RDBMS/programming can still throw a quick frontend together in Access (Ideally using an external database), that's really the main use for it.
Let's say I want to build a gaming website and I have many game sections. They ALL have a lot of data that needs to be stored. Is it better to make one database with a table representing each game or have a database represent each section of the game? I'm pretty much expecting a "depends" kind of answer.
Managing 5 different databases is going to be a headache. I would suggest using one database with 5 different tables. Aside from anything else, I wouldn't be surprised to find you've got some common info between the 5 - e.g. user identity.
Note that your idea of "a lot of data" may well not be the same as the database's... databases are generally written to cope with huge globs of data.
Depends.
Just kidding. If this is one project and the data are in any way related to each other I would always opt for one database absent a specific and convincing reason for doing otherwise. Why? Because I can't ever remember thinking to myself "Boy, I sure wish it were harder to see that information."
While there is not enough information in your question to give a good answer, I would say that unless you foresee needing data from two games at the same time for the same user (or query), there is no reason to combine databases.
You should probably have a single database for anything common, and then create independent databases for anything unique. Databases, like code, tend to end up evolving in different directions for different applications. Keeping them together may lead you to break things or to be more conservative in your changes.
In addition, some databases are optimized, managed, and backed-up at a database level rather than a table level. Since they may have different performance characteristics and usage profiles, a one-size-fit-all solution may not be scalable.
If you use an ORM framework, you get access to multiple databases (almost) for free while still avoiding code replication. So unless you have joint queries, I don't think it's worth it to pay the risk of shared databases.
Of course, if you pay someone to host your databases, it may be cheaper to use a single database, but that's really a business question, not software.
If you do choose to use a single database, do yourself a favour and make sure the code for each game only knows about specific tables. It would make it easier for you to maintain things later or separate into multiple databases.
One database.
Most of the stuff you are reasonably going to want to store is going to be text, or primitive data types such as integers. You might fancy throwing your binary content into blobs, but that's a crazy plan on a media-heavy website when the web server will serve files over HTTP for free.
I pulled lead programming duties on a web-site for a major games publisher. We managed to cover a vast portion of their current and previous content, in three European languages.
At no point did we ever consider having multiple databases to store all of this, despite the fact that each title was replete with video and image resources.
I cannot imagine why a multiple database configuration would suit your needs here, either in development or outside of it. The amount of synchronisation you'll have to pull and capacity for error is immense. Trying to pull data that pertains to all of them from all of them will be a nightmare.
Every site-wide update you migrate will be n times as hard and error prone, where n is the number of databases you eventually plump for.
Seriously, one database - and that's about as far from your anticipated depends answer as you're going to get.
If the different games don't share any data it would make sense to use separate databases. On the other hand it would make sense to use one database if the structure of the games' data is the same--you would have to make changes in every game database separately otherwise.
Update: In case of doubt you should always use one database because it's easier to manage in the most cases. Just if you're sure that the applications are completely separate and have completely different structures you should use more databases. The only real advantage is more clarity.
Generally speaking, "one database per application" tends to be a good rule of thumb.
If you're building one site that has many sections for talking about different games (or different types of games), then that's a single application, so one database is likely the way to go. I'm not positive, but I think this is probably the situation you're asking about.
If, on the other hand, your "one site" is a battle.net-type matching service for a collection of five distinct games, then the site itself is one application and each of the five games is a separate application, so you'd probably want six databases since you have a total of six largely-independent applications. Again, though, my impression is that this is not the situation you're asking about.
If you are going to be storing the same data for each game, it would make sense to use 1 database to store all the information. There would be no sense in replicating table structures across different databases, likewise there would be no sense in creating 5 tables for 5 games if they are all storing the same information.
I'm not sure this is correct, but I think you want to do one database with 5 tables because (along with other reasons) of the alternative's impact on connection pooling (if, for example, you're using ADO.Net). In the ADO.Net connection pool, connections are keyed by the connection string, so with five different databases you might end up with 20 connections to each database instead of 100 connections to one database, which would potentially affect the flexibility of the allocation of connections.
If anybody knows better or has additional info, please add it here, as I'm not sure if what I'm saying is accurate.
What's your idea of "a lot of data"? The only reason that you'd need to split this across multiple databases is if you are trying to save some money with shared hosting (i.e. getting cheap shared hosts and splitting it across servers), or if you feel each database will be in the 500GB+ range and do not have access to appropriate storage.
Note that both of these reasons have nothing to do with architecture, and entirely based on monetary concerns during scaling.
But since you haven't created the site yet, you're putting the cart before the horse. It is very unlikely that a brand new site would use anywhere near this level of storage, so just create 1 database.
Some companies have single databases in the 1,000+ TB range ... there is basically no upper bound on database size.
The number of databases you want to create depends not on the number of your games, but on the data stored in the databases, or, better say, how do you exchange these data between the databases.
If it is export and import, then do separate databases.
If it is normal relationships (with foreign keys and cross-queries), then leave it in one database.
If the databases are not related to each other, then they are separate databases, of course.
In one of my projects, I distinguished between the internal and external data (which were stored in separate databases).
The difference was quite simple:
External database stored only the facts you cannot change or undo. That was phone calls, SMS messages and incoming payments in our case.
Internal database stored the things that are usually stored: users, passwords etc.
The external database used only the natural PRIMARY KEY's, that were the phone numbers, bank transaction id's etc.
The databases were given with completely different rights and exchanging data between them was a matter of import and export, not relationships.
This made sure that nothing would happen with actual data: it is easy to relink a payment to a user, but it's very hard to restore a payment if it's lost.
I can pass on my experience with a similar situation.
We had 4 "Common" databases and about 30 "Specific" databases, separated for the same space concerns. The downside is that the space concerns were just projecting dBase shortcomings onto SQL Server. We ended up with all these databases on SQL Server Enterprise that were well under the maximum size allowed by the Desktop edition.
From a database perspective with respect to separation of concerns, the 4 Common databases could've been 2. The 30 Specific databases could've been 3 (or even 1 with enough manipulation / generalization). It was inefficient code (both stored procs and data access layer code) and table schema that dictated the multitude of databases; in the end it had nothing at all to do with space.
I would consolidate as much as possible early and keep your design & implementation flexible enough to extract components if necessary. In short, plan for several databases but implement as one.
Remember, especially on web sites. If you have multiple databases, you often lose the performance benefits of query caching and connection pooling. Stick to one.
Defenitively, one database
One place I worked had many databases, a common one for the stuff all clients used and client specifc ones for customizing by client. What ended up happening was that since the clients asked for the changes, they woudl end up inthe client database instead of common and thus there would be 27 ways of doing essentially the same thing becasue there was no refactoring from client-specific to "hey this is something other clients will need to do as well" so let's put it in common. So one database tends to lead to less reinventing the wheel.
Security Model
If each game will have a distinct set of permissions/roles specific to that game, split it out.
Query Performance /Complexity
I'd suggest keeping them in a single database if you need to frequently query across the data between the games.
Scalability
Another consideration is your scalability plans. If the games get extremely popular, you might want to buy separate database hardware for each game. Separating them into different databases from the start would make that easier.
Data Size
The size of the data should not be a factor in this decision.
Just to add a little. When you have millions and millions of players in one game and your game is realtime and you have tens of thousand simultaneous players online and you have to at least keep some essential data as up-to-date in DB as possible (say, player's virtual money). Then you will want to separate tables into independent DBs even though they are all "connected".
It really depends. And scaling will be painful whatever you may try to do to avoid it being painful. But if you really expect A LOT of players and updates and data I would advise on thinking twice, thrice and more before settling on a "one DB for several projects" solution.
Yes it will be difficult to manage several DBs probably. But you will have to do this anyway.
Really depends :)..
Ask yourself these questions:
Could there be a resuability (users table) that I may want to think about?
Is it worth seperating these entities or are they pretty much the same?
Do any of these entities share specific events / needs?
Is it worth my time and effort to build 5 different database systems (remember if you are writing the games that would mean different connection strings and also present more security, etc).
Or you could create one database OnlineGames and have a table that stores the game name and a category:
PacMan Arcade
Zelda Role playing
etc etc..
It really depends on what your intentions are...