What do Repair and Compact operations do to an .MDB? Will they stop an application crashing?

What do Repair and Compact operations do to an .MDB?
If these operations do not stop a 1GB+ .MDB backed VB application crashing, what other options are there?
Why would a large sized .MDB file cause an application to crash?

"What do compact and repair operations do to an MDB?"
First off, don't worry about repair. The fact that there are still commands that purport to do a standalone repair is a legacy of the old days. The behavior of that command changed greatly starting with Jet 3.51 and has remained the same since: a repair is never performed unless Jet/ACE determines that it is necessary. When you do a compact, it tests whether a repair is needed and performs it before the compact.
So, what does it do?
A compact/repair rewrites the data file, eliminating any unused data pages, writing tables and indexes in contiguous data pages, and flagging all saved QueryDefs for recompilation the next time they are run. It also updates certain table metadata, as well as other metadata and internal structures in the file header.
All databases have some form of "compact" operation because they are optimized for performance. Disk space is cheap, so instead of writing data in a way that uses storage efficiently, they write to the first available space. Thus, in Jet/ACE, if you update a record, the record is rewritten to the original data page only if the new data fits within that page. If not, the original data page is marked unused and the record is written to an entirely new data page. The file can therefore become internally fragmented, with used and unused data pages mixed throughout.
A compact organizes everything neatly and gets rid of all the slack space. It also rewrites data tables in primary key order (Jet/ACE clusters on the PK; that's the only index you can cluster on). Indexes are rewritten at the same time, since they too become fragmented with use.
Compact is an operation that should be part of regular maintenance of any Jet/ACE file, but you shouldn't have to do it often. If you're experiencing regular significant bloat, then it suggests that you may be mis-using your back-end database by storing/deleting temporary data. If your app adds records and deletes them as part of its regular operations, then you have a design problem that's going to make your data file bloat regularly.
To fix that problem, move the temp tables to a separate standalone MDB/ACCDB so that the churn doesn't bloat your main data file.
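To make that concrete, here's a minimal Python sketch (via win32com and DAO, like the compaction script later on this page) of re-creating a disposable side database for temp tables. The path, locale string, and table layout are all assumptions, not anything from the original app:

import os
import win32com.client

TEMP_DB = r'C:\data\app_temp.mdb'                  # placeholder location
JET_LOCALE = ';LANGID=0x0409;CP=1252;COUNTRY=0'    # general (US English) locale

engine = win32com.client.Dispatch('DAO.DBEngine.36')

# The scratch file holds nothing permanent, so just re-create it fresh.
if os.path.exists(TEMP_DB):
    os.remove(TEMP_DB)

db = engine.CreateDatabase(TEMP_DB, JET_LOCALE)
db.Execute('CREATE TABLE TempResults '
           '(ID COUNTER CONSTRAINT pk PRIMARY KEY, Payload TEXT(255))')
db.Close()

Because nothing in the scratch file matters, you never need to compact it; replacing it wholesale keeps all the churn out of the main data file.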
On another note, not applicable in this context: front ends bloat in different ways because of the nature of what's stored in them. Since this question is about an MDB/ACCDB used from VB, I won't go into details, but suffice it to say that compacting a front end is necessary during development and only very seldom in production use. The only reason to compact a production front end is to update metadata and recompile the queries stored in it.

MDB files have always become slow and prone to corruption as they grow past 1GB, but I've never known why; it's always been just a fact of life. I did some quick searching and couldn't find any official, or even well-informed insider, explanations of why this size is correlated with MDB problems, but my experience has always been that MDB files become incredibly unreliable as you approach and exceed 1GB.
Here's the MS KB article about Repair and Compact, detailing what happens during that operation:
http://support.microsoft.com/kb/209769/EN-US/
The app probably crashes as the result of improper/unexpected data returned from a database query against an MDB that large. What error in particular do you get when your application crashes? Perhaps there's a way to catch the error and deal with it instead of letting it take down the application.
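For example, in Python with pyodbc (the connection string, table, and query here are hypothetical placeholders, purely to illustrate trapping the failure rather than crashing):

import pyodbc

conn_str = (r'DRIVER={Microsoft Access Driver (*.mdb)};'
            r'DBQ=C:\data\big_backend.mdb')        # placeholder path

conn = None
try:
    conn = pyodbc.connect(conn_str)
    cur = conn.cursor()
    cur.execute('SELECT * FROM Orders WHERE OrderID = ?', 12345)
    row = cur.fetchone()
except pyodbc.Error as exc:
    # Log the SQLSTATE and driver message, then degrade gracefully
    # instead of letting the exception take the whole app down.
    print('Database error:', exc.args)
finally:
    if conn is not None:
        conn.close()

The same pattern applies in VB: wrap the query in an error handler, log what Jet reports, and recover.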

If it is crashing a lot, you might want to try a decompile on the DB and/or making a new database and copying all the objects over to the new container.
Try the decompile first. To do that, just add the /decompile switch to the command line that opens your DB, for example:
"C:\Program Files\Microsoft Office\Office\msaccess.exe" "C:\mydb.mdb" /decompile
Then compact, compile, and then compact again.
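If you'd rather script that than type it into a shortcut, something like this works (Python; the msaccess.exe path varies by Office version and is an assumption here):

import subprocess

MSACCESS = r'C:\Program Files\Microsoft Office\Office\MSACCESS.EXE'  # assumed path
DB_PATH = r'C:\mydb.mdb'

# Launches Access against the database with the /decompile switch;
# compact, recompile, and compact again afterwards.
subprocess.run([MSACCESS, DB_PATH, '/decompile'], check=True)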
EDIT:
You can't do it without Access being installed, and if the file is just storing data then a decompile won't do you any good anyway. You can, however, look at JetComp to help with your compacting needs:
support.microsoft.com/kb/273956

Related

Why is my Access 2007 database growing so much?

I work on a Win32 application that has developed a very strange problem: the database quietly grows until it finally reaches the 2 GB file size limit. We use ADO to connect to an Access 2007 database. The application has worked nicely for years with no such difficulty being observed. As you may imagine, when it reaches the 2 GB limit, the database becomes corrupt. I have quite a few customer databases now that were sent to us for repair, all around 2 GB in size. Once compacted, they come back to under 10 MB.
We see some database growth over time, but never growth on that sort of scale.
I made a small database "checker" that adds up the contents of all fields in all records to give some idea how much actual data is present. Having checked this new tool against databases that were recently compacted, I believe it is working correctly. All the bloated databases contain no more than 10 MB of actual data each.
We don't compact the database at app start. It has seemed to me that because we don't delete large amounts of data, compacting isn't something we "should" need to do. We do have customers with larger databases, but they are on earlier versions.
Can you suggest how a database that should be under 10 MB could grow to 2 GB?
A few remarks about what our app does:
Any restructuring is done using DAO, when ADO does not have the database open.
We do use transactions in a few places.
For convenience, certain records are deleted and recreated instead of found and edited in place. Typically this operation involves 5-30 records, each about 8 KB; it only occurs when the user presses "Save".
There are other record types that are about 70 KB per record, but we don't use delete/recreate with them.
We use a BLOB ("OLEObject") field to store binary data.
Thank you for any insights you can offer.
MS Access files bloat very easily. The engine holds on to the space occupied by old record versions and deleted records rather than reclaiming it immediately, so the file retains its size even after deletions.
When I write an application with an Access database I factor regular compaction into the design as it is the only way to keep the database in line.
Compacting on close can present issues (depending on the environment) such as users forcing the compact to abort because they want their computer to finish shutting down at the end of the day. Equally, compact on open can cause frustrating delays where the user would like to get into the program but cannot.
I normally try to arrange for the compact to be done as a scheduled task on an always-on PC such as a server. Please follow the link for more information: http://support.microsoft.com/kb/158937
Thank you all for your help. I found where it happened:
var
  tbl: ADOX_TLB.Table;
  cat: ADOX_TLB.Catalog;
  prop: ADOX_TLB.Property_;
begin
  cat := ADOX_TLB.CoCatalog.Create;
  cat.Set_ActiveConnection(con.ConnectionObject);
  // database growth here
  tbl := cat.Tables.Item[sTableName];
  prop := tbl.Properties['ValidationText'];
  Result := prop.Value;
  prop := nil;
  tbl := nil;
  cat := nil;
end;
Each time this function was called, the database grew by about 32 KB.
I changed the code to call this function less often, and to do the job with DAO instead of ADO.
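For anyone hitting the same thing, here's a rough sketch of the DAO route, rendered in Python via win32com purely for illustration (the original app is Delphi; the path and table name are placeholders, and the ValidationText property only exists if it was actually set on the table):

import win32com.client

engine = win32com.client.Dispatch('DAO.DBEngine.36')
db = engine.OpenDatabase(r'C:\data\mydb.mdb')    # placeholder path
try:
    # Reading the property through DAO avoided the per-call file growth
    # seen with the ADOX catalog lookup above.
    tdf = db.TableDefs.Item('MyTable')           # placeholder table name
    validation_text = tdf.Properties.Item('ValidationText').Value
    print(validation_text)
finally:
    db.Close()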
So doing a little research, I came across a discussion about how MS Access files will grow until compacted, even when data is deleted. From this I infer that they are storing the transaction history within the file. This means that they will continue to grow with each access.
The solution is compaction. You apparently need to compact the database regularly. You may want to do this on application close instead of on launch, if it takes too long.
Also note that this means multi-operation changes (such as the delete then reinsert modified value mentioned above) will likely cause the file to expand more quickly.

Is it a good idea to wrap a data migration into a single transaction scope?

I'm doing a data migration at the moment of a subset of data from one database into another.
I'm writing a .net application that is going to communicate with our in house ORM which will drag data from the source database to the target database.
I was wondering: is it feasible, or even a good idea, to put the entire process into a transaction scope and commit it only if there are no problems?
I'd say I'd be moving possibly about 1 GB of data across.
Performance is not a problem but is there a limit on how much modified or new data that can be inside a transaction scope?
There's no limit other than the physical size of the log file (note that the size required will be much more than the size of the migrated data). Also think about the case where there is an error and you roll back the transaction: that rollback may take a very, very long time.
If the original database is relatively small (< 10 gigs) then I would just make a backup and run the migration non-logged without a transaction.
If there are any issues just restore from back-up.
(I am assuming that you can take the database offline for this - doing migrations when live is a whole other ball of wax...)
If you need to do it while live then doing it in small batches within a transaction is the only way to go.
I assume you are copying data between different servers.
In answer to your question, there is no limit as such. However there are limiting factors which will affect whether this is a good idea. The primary one is locking and lock contention. I.e.:
If the server is in use for other queries, your long-running transaction will probably lock other users out.
Whereas, if the server is not in use, you don't need a transaction.
Other suggestions:
Consider writing the code so that it is incremental and interruptible, i.e. it does a bit at a time and will carry on from wherever it left off. This will involve lots of small transactions (see the sketch after these suggestions).
Consider loading the data into a temporary or staging table within the target database, then use a transaction when updating from that source, using a stored procedure or SQL batch. You should not have too much trouble putting that into a transaction because, being on the same server, it should be much, much quicker.
Also consider SSIS as an option. Actually, I know nothing about SSIS, but it is supposed to be good at this kind of stuff.
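To make the incremental approach concrete, here's a rough Python/pyodbc sketch of the batch-per-transaction idea. All connection strings, table names, and column names below are hypothetical placeholders, not anything from the poster's ORM:

import pyodbc

# Placeholder connection strings -- adjust drivers and paths for your setup.
SRC = r'DRIVER={Microsoft Access Driver (*.mdb)};DBQ=C:\data\source.mdb'
DST = r'DRIVER={SQL Server};SERVER=myserver;DATABASE=Target;Trusted_Connection=yes'
BATCH = 500

src = pyodbc.connect(SRC)
dst = pyodbc.connect(DST, autocommit=False)

# Resume from wherever the last run left off.
last_id = dst.cursor().execute('SELECT MAX(SourceID) FROM Target').fetchval() or 0

while True:
    rows = src.cursor().execute(
        f'SELECT TOP {BATCH} SourceID, Payload FROM Source '
        'WHERE SourceID > ? ORDER BY SourceID', last_id).fetchall()
    if not rows:
        break
    cur = dst.cursor()
    cur.executemany(
        'INSERT INTO Target (SourceID, Payload) VALUES (?, ?)',
        [(r.SourceID, r.Payload) for r in rows])
    dst.commit()                      # one small transaction per batch
    last_id = rows[-1].SourceID

src.close()
dst.close()

If the run is interrupted, re-running it simply picks up above the highest key already migrated.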

How to compress an MS Access database

I have an .mdb file which is 70MB.
After deleting all records contained in the file, the size remains 70MB.
How do I make my .mdb file smaller?
Every database engine that has ever existed needs regular maintenance operations run on it to optimize data storage and to recover slack space. Back in xBase days, you ran a PACK command to remove deleted rows, for instance. On SQL Server, you run scripts to shrink the actual data files for the same reason.
Why does every database engine do this?
Because it would be a huge performance hit if every write to the database had to rewrite the whole file in optimized order. Consider a database that stores each data table in a separate file. If a table has 10,000 records and you delete the 5,000th record, then to get rid of the slack space you'd have to rewrite the whole second half of the data file. Instead, every database engine marks the space as unused and discardable, to be reclaimed the next time the optimize operations are run on the data table.
Jet/ACE is no different in this regard than any other database engine and any application using a Jet/ACE database as a data store should have regular maintenance operations scheduled, including a backup and then a compact.
There are some issues with this in Jet/ACE that aren't present in server database engines. Specifically, you can't compact unless all users have closed their connections to the data file. In a server database, the users connect to the database engine's server-side process, and that server-side daemon is the only "user" of the actual data files in which the data is stored. Thus, the server daemon can decide when to perform the optimization and maintenance routines, since it's entirely in control of when the data files are in use.
One common problem with Access applications is that users will leave their application open on their computers and leave the office for the day, which means that when you run your compact operation, say at 2:00am, the file is still open and you can't run it (because compact replaces the original file). Most programmers of Access applications who encounter this problem will either tolerate the occasional failure of this kind of overnight maintenance (volume shadow copy still allows a backup of the file, though there's no guarantee that backup copy will be in a 100% internally consistent state), or they will engineer their Access applications to terminate at a time appropriate to allow overnight maintenance operations. I've done both, myself.
In non-Access applications, the same problem exists, but has to be tackled differently. For web applications, it's something of a problem, but in general, I'd say that any web app that churns the data enough that a compact would be needed is one for which a Jet/ACE data store is wholly inappropriate.
Now, on the subject of COMPACT ON CLOSE:
It should never be used by anyone.
Ever.
It's useless and downright dangerous when it actually kicks in.
It's useless because there's no properly-architected production environment in which users would ever be opening the back end -- if it's an Access app, it should be split, with users only ever opening the front end, and if it's a web app, users won't be interacting directly with the data file. So in both scenarios, nobody is ever going to trigger the COMPACT ON CLOSE, so you've wasted your time turning it on.
Secondly, even if somebody does occasionally trigger it, it's only going to work if that user is the only one with the database open. As I said above, it can't be compacted if there are other users with it open, so this isn't going to work, either -- COMPACT ON CLOSE can only run when the user triggering it has exclusive access.
But worst of all, COMPACT ON CLOSE is dangerous, and if it does run it can lead to actual data loss. This is because there are certain states a Jet/ACE database can be in wherein internal structures are out of whack but the data is all still accessible. When the compact/repair operation is run in that state, data can potentially be lost. It's an extremely rare condition, but it's a real possibility.
The point is that COMPACT ON CLOSE is not conditional, and there is no prompt that asks you if you want to run it. You don't get a chance to do a backup before it runs, so if you have it turned on and it kicks in when your database is in that very rare state, you could lose data that you'd otherwise be able to recover if you did not run the compact operation.
So, in short, nobody with any understanding of Jet/ACE and compacting ever turns on COMPACT ON CLOSE.
For a single user, you can just compact as needed.
For a shared application, some kind of scheduled maintenance script is the best approach, usually run overnight on the file server. That script would make a backup of the file, then run the compact. It's quite a simple script to write in VBScript, and easily scheduled.
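As a rough illustration of such a job (in Python rather than VBScript, to match the other scripts on this page; the paths and the DAO version string are assumptions):

import os
import shutil
import time
import win32com.client

DB = r'\\server\share\backend.mdb'               # placeholder path
BACKUP = DB + time.strftime('.%Y%m%d.bak')       # dated backup alongside it
COMPACTED = DB + '.compacting'

shutil.copy2(DB, BACKUP)                         # always back up first

engine = win32com.client.Dispatch('DAO.DBEngine.36')
# CompactDatabase writes to a new file and fails if anyone still has the
# database open -- which is exactly the safe behavior you want overnight.
engine.CompactDatabase(DB, COMPACTED)

os.replace(COMPACTED, DB)                        # swap the compacted copy in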
Last of all, if your application frequently deletes large numbers of records, in most cases that's an indication of a design error. Records that are added and deleted in regular production use are TEMPORARY DATA and don't belong in your main data file, both logically speaking and pragmatically speaking.
All of my production apps have a temp database as part of the architecture, and all temp tables are stored there. I never bother to compact the temp databases. If for some reason performance bogged down because of bloat within the temp database, I'd just copy a pristine empty copy of the temp database over top of the old one, since none of the data in there is anything other than temporary. This reduces churn and bloat in front end or back end and greatly reduces the frequency of necessary compacts on the back end data file.
On the question of how to compact, there are a number of options:
in the Access UI you can compact the currently open database (TOOLS | DATABASE UTILITIES). However, that doesn't allow you to make a backup as part of the process, and it's always a good idea to back up before compacting, just in case something goes wrong.
in the Access UI you can compact a database that is not open. This one compacts from an existing file to a new one, so when you're done you have to rename both the original and the newly compacted file (to have the new name). The FILE OPEN dialog that asks you what file to compact from does allow you to rename the file at that point, so you can do it as part of the manual process.
in code, you can use the DAO DBEngine.CompactDatabase method to do the job. This is usable from within Access VBA, or from a VBScript, or from any environment where you can use COM. You are responsible in your code for doing the backup and renaming files and so forth.
another option in code is JRO (Jet & Replication Objects), but it offers nothing in regard to compact operations that DAO doesn't already have. JRO was created as a separate library to handle Jet-specific features that were not supported in ADO itself, so if you're using ADO as your interface, the MS-recommended library for compacting would be JRO. From within Access, JRO is inappropriate for compact, as you'd already have the CompactDatabase method available even without a DAO reference (the DBEngine is always available in Access whether or not you have a DAO reference). In other words, DBEngine.CompactDatabase can be used within Access without either a DAO or ADO reference, whereas the JRO CompactDatabase method is only available with a JRO reference (or using late binding). From outside of Access, JRO may be the appropriate library.
Let me stress how important backups are. You won't need one 999 times out of 1,000 (or even more rarely than that), but when you need it, you'll need it badly! So never compact without making a backup beforehand.
Finally, after any compact, it's a good idea to check the compacted file for a system table called MSysCompactError. That table lists any problems encountered during the compact; it is only created if there were errors.
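A quick way to automate that check, sketched in Python/pyodbc (the connection string is a placeholder, and the table is absent entirely when the compact was clean):

import pyodbc

conn = pyodbc.connect(
    r'DRIVER={Microsoft Access Driver (*.mdb)};DBQ=C:\data\backend.mdb')
try:
    cur = conn.cursor()
    try:
        for row in cur.execute('SELECT * FROM MSysCompactError'):
            print(row)   # each row describes one problem the compact hit
    except pyodbc.Error:
        print('No MSysCompactError table: the compact logged no problems.')
finally:
    conn.close()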
That's all I can think of regarding compact for now.
Open the mdb and do a 'Compact and Repair'. This will reduce the size of the mdb.
You can also set the 'Compact on Close' option to on (off by default).
Here is a link to some additional information:
http://www.trcb.com/computers-and-technology/data-recovery/ways-to-compact-and-repair-an-access-database-27384.htm
The Microsoft Access database engine provides a CompactDatabase method that makes a compact copy of the database file. The database file must be closed before calling CompactDatabase.
Documentation:
Pages on microsoft.com about "Compact and Repair Database"
DBEngine.CompactDatabase Method (DAO)
Here's a Python script that uses DAO to copy and compact MDB files:
import os.path
import sys

import win32com.client

# Access 97:        DAO.DBEngine.35
# Access 2000/2003: DAO.DBEngine.36
# Access 2007:      DAO.DBEngine.120
daoEngine = win32com.client.Dispatch('DAO.DBEngine.36')

if len(sys.argv) != 3:
    print("Uses Microsoft DAO to copy the database file and compact it.")
    print("Usage: %s DB_FILE FILE_TO_WRITE" % os.path.basename(sys.argv[0]))
    sys.exit(2)

(src_db_path, dest_db_path) = sys.argv[1:]

print('Using database "%s", compacting to "%s"' % (src_db_path, dest_db_path))
daoEngine.CompactDatabase(src_db_path, dest_db_path)
print("Done")
With Python you can compact with the pypyodbc library (either .mdb or .accdb):
import pypyodbc
pypyodbc.win_compact_mdb('C:\\data\\database.accdb','C:\\data\\compacted.accdb')
Then you can copy compacted.accdb back to database.accdb with shutil:
import shutil
shutil.copy2('C:\\data\\compacted.accdb','C:\\data\\database.accdb')
Note: As far as I know, for Access databases over ODBC, Python and its libraries must be 32-bit. Also, these steps probably only work on Windows.

Can splitting .MDB files into segments help with stability?

Is this a realistic solution to the problems associated with larger .mdb files:
split the large .mdb file into smaller .mdb files
have one 'central' .mdb containing links to the tables in the smaller .mdb files
How easy would it be to make this change to an .mdb backed VB application?
Could the changes to the database be done so that there are no changes required to the front-end application?
Edit Start
The short answer is "No, it won't solve the problems of a large database."
You might be able to overcome the DB size limitation (~2GB) by using this trick, but I've never tested it.
Typically, with large MS Access databases, you run into problems with speed and data corruption.
Speed
Is it going to help with speed? You still have the same amount of data to query and search through, and the same algorithm. So all you are doing is adding the overhead of having to open up multiple files per query. So I would expect it to be slower.
You might be able to speed it up by reducing the time it takes to get the information off the disk. You can do this in a few ways:
faster drives
put the MDB on a RAID array (anecdotally, RAID 1+0 may be faster)
split the MDB up (as you suggest) into multiple MDBs, and put them on separate drives (maybe even separate controllers).
(how well this would work in practice vs. theory, I can't tell you - if I was doing that much work, I'd still choose to switch DB engines)
Data Corruption
MS Access has a well-deserved reputation for data corruption. To be fair, I haven't had it happen to me for some time. This may be because I've learned not to use it for anything big, or it may be because MS has put a lot of work into trying to solve these problems, or, more likely, a combination of both.
The prime culprits in data corruption are:
Hardware: e.g., cosmic rays, electrical interference, iffy drives, iffy memory, and iffy CPUs. I suspect MS Access does not have error handling/correction as good as other databases do.
Networks: lots of collisions on a saturated network can confuse MS Access and convince it to scramble important records; as can sub-optimally implemented network protocols. TCP/IP is good, but it's not invincible.
Software: As I said, MS has done a lot of work on MS Access over the years; if you are not up to date on your patches (MS Office and OS), get up to date. Problems typically happen when you hit extremes like the 2 GB limit (some bugs are hard to test for and won't manifest themselves except at edge cases, which makes them less likely to have been seen or corrected unless reported to MS by a motivated user).
All this is exacerbated with larger databases, because larger databases typically have more users and more workstations accessing them. Together, the larger database and the larger number of users multiply the opportunities for corruption.
Edit End
Your best bet would be to switch to something like MS SQL Server. You could start by migrating your data over and then linking one MDB to it. You get the stability of SQL Server, and most (if not all) of your code should still work.
Once you've done that, you can then start migrating your VB app(s) over to use SQL Server instead.
If you have more data than fits in a single MDB then you should get a different database engine.
One main issue that you should consider is that you can't enforce referential integrity between tables stored in different MDBs. That should be a show-stopper for any actual database.
If it's not, then you probably don't have a proper schema designed in the first place.
For reasons more adequately explained by CodeSlave the answer is No and you should switch to a proper relational database.
I'd like to add that this does not have to be SQL Server. Quite possibly the reason why you are reluctant to do this is one of cost, SQL Server being quite expensive to obtain and deploy if you are not in an educational or charitable organisation (when it's remarkably cheap and then usually a complete no-brainer).
I've recently had extremely good results moving an Access system from MDB to MySQL. At least 95% of the code functioned without modification, and of the remaining 5%, most was straightforward, with only a few limited areas where significant effort was required. If you have sloppy code (not closing connections or releasing objects) then you'll need to fix that, but generally I was remarkably surprised how painless this approach was. Certainly, if the reason you are reluctant to move to a database server backend is one of cost, I would highly recommend that you not keep wrestling with .mdb files and instead go for the more robust database solution.
Hmm, well, if the data is going through this central DB then there is still going to be a bottleneck there. The only reason I can think of why you would do this is to get around the size limit of an Access MDB file.
Having said that, if the business functions can be split off into separate applications, then that might be a good option, with a central DB containing all the linked tables for reporting purposes. I have used this before to good effect.
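For what it's worth, creating those links programmatically is straightforward with DAO. A minimal sketch in Python via win32com, where every path and table name is a placeholder:

import win32com.client

engine = win32com.client.Dispatch('DAO.DBEngine.36')
central = engine.OpenDatabase(r'C:\data\central.mdb')    # placeholder path
try:
    # A linked table is just a TableDef whose Connect string points at
    # another Jet database (the leading ';' means a Jet-to-Jet link).
    tdf = central.CreateTableDef('Customers')
    tdf.Connect = r';DATABASE=C:\data\segment1.mdb'
    tdf.SourceTableName = 'Customers'
    central.TableDefs.Append(tdf)
finally:
    central.Close()

Because the front-end application only ever sees the table name in the central file, the physical location of the underlying table can change without touching the application.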

Why should I care about compacting an MS Access .mdb file?

We distribute an application that uses an MS Access .mdb file. Somebody has noticed that after opening the file in MS Access the file size shrinks a lot. That suggests that the file is a good candidate for compacting, but we don't supply the means for our users to do that.
So, my question is, does it matter? Do we care? What bad things can happen if our users never compact the database?
In addition to making your database smaller, it'll recompute the indexes on your tables and defragment your tables which can make access faster. It'll also find any inconsistencies that should never happen in your database, but might, due to bugs or crashes in Access.
It's not totally without risk, though: a bug in Access 2007 would occasionally delete your database during the process.
So it's generally a good thing to do, but pair it with a good backup routine. With the backup in place, you can also recover from any 'unrecoverable' compact and repair problems with a minimum of data loss.
Make sure you compact and repair the database regularly, especially if the database application experiences frequent record updates, deletions and insertions. Not only will this keep the size of the database file down to the minimum - which will help speed up database operations and network communications - it performs database housekeeping, too, which is of even greater benefit to the stability of your data. But before you compact the database, make sure that you make a backup of the file, just in case something goes wrong with the compaction.
Jet compacts a database to reorganize the content within the file so that each 4 KB "page" (2KB for Access 95/97) of space allotted for data, tables, or indexes is located in a contiguous area. Jet recovers the space from records marked as deleted and rewrites the records in each table in primary key order, like a clustered index. This will make your db's read/write ops faster.
Jet also updates the table statistics during compaction. This includes identifying the number of records in each table, which will allow Jet to use the most optimal method to scan for records, either by using the indexes or by using a full table scan when there are few records. After compaction, run each stored query so that Jet re-optimizes it using these updated table statistics, which can improve query performance.
Access 2000, 2002, 2003 and 2007 combine the compaction with a repair operation if it's needed. The repair process:
1 - Cleans up incomplete transactions
2 - Compares data in system tables with data in actual tables, queries and indexes and repairs the mistakes
3 - Repairs very simple data structure mistakes, such as lost pointers to multi-page records (which isn't always successful and is why "repair" doesn't always work to save a corrupted Access database)
4 - Replaces missing information about a VBA project's structure
5 - Replaces missing information needed to open a form, report and module
6 - Repairs simple object structure mistakes in forms, reports, and modules
The bad things that can happen if the users never compact/repair the db are that it will become slow due to bloat, and it may become unstable, meaning corrupted.
Compacting an Access database (also known as a MS JET database) is a bit like defragmenting a hard drive. Access (or, more accurately, the MS JET database engine) isn't very good with re-using space - so when a record is updated, inserted, or deleted, the space is not always reclaimed - instead, new space is added to the end of the database file and used instead.
A general rule of thumb is that if your [Access] database will be written to (updated, changed, or added to), you should allow for compacting - otherwise it will grow in size (much more than just the data you've added, too).
So, to answer your question(s):
Yes, it does matter (unless your database is read-only).
You should care (unless you don't care about your users' disk space).
If you don't compact an Access database, over time it will grow much, much, much larger than the data inside it would suggest, slowing down performance and increasing the possibilities of errors and corruption. (As a file-based database, Access database files are notorious for corruption, especially when accessed over a network.)
This article on How to Compact Microsoft Access Database Through ADO will give you a good starting point if you decide to add this functionality to your app.
I would offer the users a method for compacting the database. I've seen databases grow to 600+ megabytes when compacting will reduce to 60-80.
To echo Nate:
In older versions, I've had it corrupt databases - so a good backup regime is essential. I wouldn't code anything into your app to do that automatically. However, if a customer finds that their database is running really slow, your tech support people could talk them through it if need be (with appropriate backups of course).
If their database is getting to be so large that compaction starts to become a necessity, though, maybe it's time to move to MS SQL.
I've found that Access database files almost always get corrupted over time. Compacting and repairing them helps hold that off for a while.
Well, it really matters! MDB files keep increasing in size each time you manipulate their data, until they reach an unbearable size. But you don't have to supply a compacting method through your interface. You can add the following code to your MDB file to have it compacted each time the file is closed:
Application.SetOption "Auto Compact", True
I would also highly recommend looking into VistaDB (http://www.vistadb.net/) or SQL Compact (http://www.microsoft.com/sql/editions/compact/) for your application. These might not be the right fit for your app, but they are definitely worth a look.
If you don't offer your users a way to compact the database and the raw size isn't an issue to begin with, then don't bother.