Internal data redundancy in Microsoft Access - ms-access

We are using MS Access 2010 and our data file grows by an unnecessary 50% or so every day. We run the compact and repair process every night, but almost every day, in the middle of the day, this huge increase happens again and performance suffers badly, so we have to run the process manually; after that the huge size difference disappears. I suspect the cause is the internal behaviour of the Access engine while updating data.
Can anyone explain how much space is wasted internally by the database engine when a record is updated?
For instance, suppose we have a record of 100 bytes and an update shrinks it to 80 bytes: how much space is wasted, 20 bytes or much more than that?
Conversely, when an update makes a record larger, does the update process create any wasted space in the data file?
Any idea or suggestion on how to boost performance would be appreciated.

You can run Compact & Repair via VBA:
Public Sub CompactDB()
    ' Drives the built-in menu command; compacts the currently open database
    CommandBars("Menu Bar").Controls("Tools").Controls("Database utilities").Controls("Compact and repair database...").accDoDefaultAction
End Sub
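If you would rather not drive the menus, one alternative worth considering is Application.CompactRepair (available since Access 2002), which compacts a file that is not currently open. A minimal sketch, with assumed placeholder paths:

Public Sub CompactOtherDB()
    ' Compacts a closed file into a new copy; both paths below are placeholders.
    Dim ok As Boolean
    ok = Application.CompactRepair( _
             SourceFile:="\\server\share\Data.mdb", _
             DestinationFile:="\\server\share\Data_compacted.mdb", _
             LogFile:=True)
    If Not ok Then MsgBox "Compact failed - check the log table in the destination file."
End Sub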
Reasons your database can bloat (compacting only solves some of this -- decompiling / recompiling is necessary for the rest, if you code / use macros).
MS Access is file-based, not server-based, so it is constantly writing and rewriting variable-length data to the file on disk. To get around this, switch to MS Access ADP files using either MSDE, which you can install from the MS Office Professional CD by browsing to it on the CD (it is not part of the installation wizard), or hook the database up to a server such as SQL Server. You'll have to build a new MS Access document of type ADP (as opposed to MDB). Doing so puts you in a different development regime than you're used to, however, so read about this before doing it.
Compiling. Using macros plus the "compile in background" option is no different from compiling your MS Access project after coding in Access Basic, Visual Basic for Access, or Visual Basic using the VB Editor that comes with MS Access.
Whatever changes you made last time remain as compiled pseudocode, so you are pancaking one change on top of another, even though you are only working with the latest version of your code.
Queries, especially large queries, take up space when they're run, and that space is never reclaimed until you compact. You can make your queries more efficient, but you'll never get away from this completely.
Locktypes, cursortypes, and cursorlocations on ADODB, depending on how you set them up, can take up a lot of space if you choose combinations that are really data intensive. These can be marshalled (configured) in such a way as to return only what's necessary. There is a knowledge base article in the MSDN library at microsoft.com detailing how ADODB causes a lot of bloat and recommending DAO instead, but this is a cop-out; use ADODB well and you'll largely get around this, and DAO does not eliminate bloat either.
DAO functions.
Object creation -- tables, forms, controls, reports -- all take up space. If you create a form and delete it later, the space the form occupied is not reclaimed until you compact.
Cute pictures. These always take up space, and MS Access does not store them efficiently. A 20 KB JPEG can end up stored as an 800 KB or 1 MB bitmap inside Access, and there's nothing you can do about that in MS Access 97. You can put the image on one form and use subform references to that image wherever you want it, but you still don't get around the inefficient storage format.
OLE Objects. If you have an OLE field and decide to insert, say, a spreadsheet into that field, you take the entire Excel workbook with it, not just that sheet. Be careful how you use OLE objects.
Table properties with the subdatasheet name set to [Auto]. Set this property, for all tables, to [None]; a sketch that does this for every table follows this list. Depending on how many tables you have, performance can also perceptibly improve.
You can also get the Jet Compact utility from Microsoft.com for databases that are corrupted.
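Regarding the subdatasheet point above, here is a rough sketch (assuming a DAO reference) that forces the SubdatasheetName property of every local table to [None]; the property does not exist until it has been set once, so it is created on the fly when missing:

Public Sub DisableSubdatasheets()
    Dim db As DAO.Database, tdf As DAO.TableDef, prp As DAO.Property
    Set db = CurrentDb
    For Each tdf In db.TableDefs
        If Left$(tdf.Name, 4) <> "MSys" Then              ' skip system tables
            On Error Resume Next
            tdf.Properties("SubdatasheetName") = "[None]"
            If Err.Number = 3270 Then                     ' 3270 = property not found
                Err.Clear
                Set prp = tdf.CreateProperty("SubdatasheetName", dbText, "[None]")
                tdf.Properties.Append prp
            End If
            On Error GoTo 0
        End If
    Next tdf
End Sub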

Related

Access file size with SharePoint lists

I am using Access as a front end, with many linked SharePoint lists as the data source. The database has no native tables; all data is stored in SharePoint lists. There is significant code and several forms.
I use the SharePoint list as a data source because (a) it is a data source easily accessible by the entire team that uses it, (b) my company restricts the availability of other sources - I'd prefer to have an SQL data source available, but this is not easily done - and (c) the data is company sensitive, and an off-site resource isn't an option.
As I've added code and forms, the file size has understandably grown, but it is MUCH larger than I would expect from a database that houses no data (currently over 21 MB).
I have no performance issues, other than the speed of the link to the SharePoint resource. That has not changed over time - the database performs about the same now (at 21 MB) as it did when it was a fledgling 10 MB at its creation.
My question - should I expect a database with no inherent data to be this large? It seems to me that there is no way some forms, queries, and some VBA are driving 21 MB of storage.
Thank you for your time - Mike.
Assuming you do a frequent compact and repair, that should give you the smallest size.
I would check whether forms have "image" backgrounds or "images" to fancy up the form. If you're using the "older" image format, make sure you set Access to use the newer format.
Also keep in mind that Access 2010 and later will "copy" - or, better said, cache - the SharePoint tables locally. This is done for performance reasons. The setting that controls this is under File -> Options ->
Note that in the options above you can choose to clear the cache on exit, but that results in a huge performance penalty, so it is best to leave the caching settings as they are.
So the SharePoint tables are copied locally (there is a local copy of the data). This would explain the rather large size despite your using linked tables to the data on SharePoint.

Why is my Access 2007 database growing so much?

I work on a Win32 application that has developed a very strange problem: the database quietly grows until it finally reaches the 2 GB file size limit. We use ADO to connect to an Access 2007 database. The application has worked nicely for years with no such difficulty being observed. As you may imagine, when it reaches the 2 GB limit, the database becomes corrupt. I have quite a few customer databases now that were sent to us for repair - all around 2 GB in size. Once compacted, they come back to < 10 MB.
We see some database growth over time, but never growth on that sort of scale.
I made a small database "checker" that adds up the contents of all fields in all records to give some idea of how much actual data is present. Having checked this new tool on databases that have recently been compacted, I think the tool is working correctly. None of the bloated databases has more than 10 MB of actual data.
We don't compact the database at app start. It has seemed to me that, because we don't delete large amounts of data, compacting the database isn't something we "should" need to do. We do have customers with larger databases, but they are on earlier versions.
Can you suggest how a database that should be under 10 MB could grow to 2 GB?
A few remarks about what our app does:
Any restructuring is done using DAO, while ADO does not have the database open.
We do use transactions in a few places.
For convenience, certain records are deleted and recreated instead of found and edited in place. Typically this operation involves 5-30 records, each about 8 KB, and it only occurs when the user presses "Save".
There are other record types that are about 70 KB per record, but we're not using delete/recreate with them.
We use a BLOB ("OLEObject") field to store binary data.
Thank you for any insights you can offer.
MS Access files bloat very easily. They store a lot of transaction history, and they retain their size when records are deleted.
When I write an application with an Access database I factor regular compaction into the design as it is the only way to keep the database in line.
Compacting on close can present issues (depending on the environment) such as users forcing the compact to abort because they want their computer to finish shutting down at the end of the day. Equally, compact on open can cause frustrating delays where the user would like to get into the program but cannot.
I normally try to arrange for the compact to be done as a scheduled task on an always-on PC such as a server. Please follow the link for more information: http://support.microsoft.com/kb/158937
Thank you all for your help. I found where it happened:
var
  tbl: ADOX_TLB.Table;
  cat: ADOX_TLB.Catalog;
  prop: ADOX_TLB.Property_;
begin
  cat := ADOX_TLB.CoCatalog.Create;
  cat.Set_ActiveConnection(con.ConnectionObject);
  // database growth happens here: opening the ADOX catalog on the live connection
  tbl := cat.Tables.Item[sTableName];
  prop := tbl.Properties['ValidationText'];
  Result := prop.Value;
  prop := nil;
  tbl := nil;
  cat := nil;
end;
Each time this function was called, the database grew by about 32 KB.
I changed the code to call this function less often, and to use DAO instead of ADO.
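For comparison, the DAO route is a one-liner from VBA (the poster was working in Delphi, where the same DAO COM objects can be automated); the table name is whatever you would otherwise pass to the ADOX catalog:

Public Function GetValidationText(ByVal sTableName As String) As String
    ' Reads the rule text straight from the DAO TableDef, without opening an ADOX catalog
    GetValidationText = CurrentDb.TableDefs(sTableName).ValidationText
End Function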
So doing a little research, I came across a discussion about how MS Access files will grow until compacted, even when data is deleted. From this I infer that they are storing the transaction history within the file. This means that they will continue to grow with each access.
The solution is compaction. You apparently need to compact the database regularly. You may want to do this on application close instead of at launch if it takes too long.
Also note that this means multi-operation changes (such as the delete then reinsert modified value mentioned above) will likely cause the file to expand more quickly.

How to compress an MS Access database

I have an .mdb file which is 70MB.
After deleting all records contained in the file, the size remains 70MB.
How do I make my .mdb file smaller?
Every database engine that has ever existed needs regular maintenance operations run on it to optimize data storage and to recover slack space. Back in xBase days, you ran a PACK command to remove deleted rows, for instance. On SQL Server, you run scripts to shrink the actual data files for the same reasons.
Why does every database engine do this?
Because it would be a huge performance hit if every write to the database had to rewrite the whole file in optimized order. Consider a database that stores each data table in a separate file. If a table has 10,000 records and you delete the 5,000th record, then to get rid of the slack space you'd have to rewrite the whole second half of the data file. Instead, every database engine uses some form of marking the space as unused and discardable the next time the optimization operations are run on the data table.
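A toy demonstration of that point in VBA (the scratch paths are placeholders): deleting every row leaves the file at full size, and only a compact shrinks it.

Public Sub ShowSlackSpace()
    Const SRC As String = "C:\temp\scratch.mdb"
    Const DST As String = "C:\temp\scratch_compacted.mdb"
    Dim db As DAO.Database, i As Long

    If Dir$(SRC) <> "" Then Kill SRC
    If Dir$(DST) <> "" Then Kill DST

    Set db = DBEngine.CreateDatabase(SRC, dbLangGeneral)
    db.Execute "CREATE TABLE Junk (ID COUNTER PRIMARY KEY, Payload TEXT(255))"
    For i = 1 To 5000
        db.Execute "INSERT INTO Junk (Payload) VALUES ('" & String$(250, "x") & "')"
    Next i
    db.Execute "DELETE FROM Junk"          ' remove every row; the pages are only marked unused
    db.Close

    Debug.Print "After delete:  "; FileLen(SRC); " bytes"    ' still the full size
    DBEngine.CompactDatabase SRC, DST
    Debug.Print "After compact: "; FileLen(DST); " bytes"    ' back down to a small file
End Sub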
Jet/ACE is no different in this regard than any other database engine and any application using a Jet/ACE database as a data store should have regular maintenance operations scheduled, including a backup and then a compact.
There are some issues with this in Jet/ACE that aren't present in server database engines. Specifically, you can't compact unless all users have closed their connections to the data file. In a server database, the users connect to the database engine's server-side process, and that server-side daemon is the only "user" of the actual data files in which the data is stored. Thus, the server daemon can decide when to perform the optimization and maintenance routines, since it's entirely in control of when the data files are in use or not.
One common problem with Access applications is that users will leave their application open on their computers and leave the office for the day, which means that when you run your compact operation, say at 2:00am, the file is still open and you can't run it (because compact replaces the original file). Most programmers of Access applications who encounter this problem will either tolerate the occasional failure of this kind of overnight maintenance (volume shadow copy still allows a backup of the file, though there's no guarantee that backup copy will be in a 100% internally consistent state), or they will engineer their Access applications to terminate at a time appropriate to allow overnight maintenance operations. I've done both, myself.
In non-Access applications, the same problem exists, but has to be tackled differently. For web applications, it's something of a problem, but in general, I'd say that any web app that churns the data enough that a compact would be needed is one for which a Jet/ACE data store is wholly inappropriate.
Now, on the subject of COMPACT ON CLOSE:
It should never be used by anyone.
Ever.
It's useless and downright dangerous when it actually kicks in.
It's useless because there's no properly-architected production environment in which users would ever be opening the back end -- if it's an Access app, it should be split, with users only ever opening the front end, and if it's a web app, users won't be interacting directly with the data file. So in both scenarios, nobody is ever going to trigger the COMPACT ON CLOSE, so you've wasted your time turning it on.
Secondly, even if somebody does occasionally trigger it, it's only going to work if that user is the only one with the database open. As I said above, it can't be compacted if there are other users with it open, so this isn't going to work, either -- COMPACT ON CLOSE can only run when the user triggering it has exclusive access.
But worst of all, COMPACT ON CLOSE is dangerous, and if it does run it can lead to actual data loss. This is because there are certain states a Jet/ACE database can be in wherein internal structures are out of whack but the data is all still accessible. When the compact/repair operation is run in that state, data can potentially be lost. This is an extremely rare condition, but it is a real possibility.
The point is that COMPACT ON CLOSE is not conditional, and there is no prompt that asks you if you want to run it. You don't get a chance to do a backup before it runs, so if you have it turned on and it kicks in when your database is in that very rare state, you could lose data that you'd otherwise be able to recover if you did not run the compact operation.
So, in short, nobody with any understanding of Jet/ACE and compacting ever turns on COMPACT ON CLOSE.
For a single user, you can just compact as needed.
For a shared application, some kind of scheduled maintenance script is the best thing, usually running overnight on the file server. That script would make a backup of the file, then run the compact. It's quite a simple script to write in VBScript, and easily scheduled.
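As a sketch of that kind of maintenance script (the answer suggests VBScript; the VBA below makes essentially the same calls through COM, and every path is an assumption):

Public Sub NightlyMaintenance()
    Const DATA As String = "\\server\share\Data.mdb"            ' back-end file (placeholder path)
    Const TMP As String = "\\server\share\Data_compacted.mdb"   ' temporary output (placeholder path)
    Const BACKUP As String = "\\server\share\Backup\Data_"      ' backup prefix (placeholder path)

    ' 1. Back up first; a failed compact without a backup is a disaster.
    FileCopy DATA, BACKUP & Format$(Now, "yyyymmdd_hhnn") & ".mdb"

    ' 2. Compact to a new file, then swap it into place. Fails if anyone still has the file open.
    DBEngine.CompactDatabase DATA, TMP
    Kill DATA
    Name TMP As DATA
End Sub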
Last of all, if your application frequently deletes large numbers of records, in most cases that's an indication of a design error. Records that are added and deleted in regular production use are TEMPORARY DATA and don't belong in your main data file, both logically speaking and pragmatically speaking.
All of my production apps have a temp database as part of the architecture, and all temp tables are stored there. I never bother to compact the temp databases. If for some reason performance bogged down because of bloat within the temp database, I'd just copy a pristine empty copy of the temp database over top of the old one, since none of the data in there is anything other than temporary. This reduces churn and bloat in front end or back end and greatly reduces the frequency of necessary compacts on the back end data file.
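A minimal sketch of that temp-database swap, assuming nothing has the temp file open and with placeholder paths:

Public Sub ResetTempDatabase()
    Const TEMPLATE_PATH As String = "C:\App\Templates\Temp_Empty.mdb"   ' pristine empty copy (placeholder)
    Const TEMP_PATH As String = "C:\App\Data\Temp.mdb"                  ' working temp database (placeholder)
    ' Overwrite the bloated temp file with the empty template; FileCopy fails if the target is open.
    FileCopy TEMPLATE_PATH, TEMP_PATH
End Sub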
On the question of how to compact, there are a number of options:
in the Access UI you can compact the currently open database (TOOLS | DATABASE UTILITIES). However, that doesn't allow you to make a backup as part of the process, and it's always a good idea to backup before compacting, just in case something goes wrong.
in the Access UI you can compact a database that is not open. This one compacts from an existing file to a new one, so when you're done you have to rename both the original and the newly compacted file (to have the new name). The FILE OPEN dialog that asks you what file to compact from does allow you to rename the file at that point, so you can do it as part of the manual process.
in code, you can use the DAO DBEngine.CompactDatabase method to do the job. This is usable from within Access VBA, or from a VBScript, or from any environment where you can use COM. You are responsible in your code for doing the backup and renaming files and so forth.
another option in code is JRO (Jet & Replication Objects), but it offers nothing in regard to compact operations that DAO doesn't already have. JRO was created as a separate library to handle Jet-specific features that were not supported in ADO itself, so if you're using ADO as your interface, the MS-recommended library for compacting would be JRO. From within Access, JRO is inappropriate for compact, as you'd already have the CompactDatabase method available, even if you don't have a DAO reference (the DBEngine is always available in Access whether or not you have a DAO reference). In other words, DBEngine.CompactDatabase can be used within Access without either a DAO or ADO reference, whereas the JRO CompactDatabase method is only available with a JRO reference (or using late binding). From outside of Access, JRO may be the appropriate library.
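For completeness, a late-bound JRO sketch (placeholder paths; JRO takes OLE DB connection strings rather than file paths):

Public Sub CompactWithJRO()
    Dim je As Object
    Set je = CreateObject("JRO.JetEngine")
    je.CompactDatabase _
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\data\Old.mdb", _
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\data\Compacted.mdb"
End Sub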
Let me stress how important backups are. You won't need it 999 times out of 1000 (or even less often), but when you need it, you'll need it bad! So never compact without making a backup beforehand.
Finally, after any compact, it's a good idea to check the compacted file to see if there's a system table called MSysCompactErrors. This table will list any problems encountered during the compact, if there were any.
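A small sketch of that check, run in the compacted file:

Public Sub CheckCompactErrors()
    ' Looks for the MSysCompactErrors table and dumps its rows to the Immediate window.
    If DCount("*", "MSysObjects", "Name = 'MSysCompactErrors'") > 0 Then
        Dim rs As DAO.Recordset, f As DAO.Field
        Set rs = CurrentDb.OpenRecordset("SELECT * FROM MSysCompactErrors")
        Do Until rs.EOF
            For Each f In rs.Fields
                Debug.Print f.Name; " = "; Nz(f.Value, "")
            Next f
            rs.MoveNext
        Loop
        rs.Close
    Else
        Debug.Print "No compact errors logged."
    End If
End Sub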
That's all I can think of regarding compact for now.
Open the mdb and do a 'Compact and Repair'. This will reduce the size of the mdb.
You can also set the 'Compact on Close' option to on (off by default).
Here is a link to some additional information:
http://www.trcb.com/computers-and-technology/data-recovery/ways-to-compact-and-repair-an-access-database-27384.htm
The Microsoft Access database engine provides a CompactDatabase method that makes a compact copy of the database file. The database file must be closed before calling CompactDatabase.
Documentation:
Pages on microsoft.com about "Compact and Repair Database"
DBEngine.CompactDatabase Method (DAO)
Here's a Python script that uses DAO to copy and compact MDB files:
import os.path
import sys

import win32com.client

# Access 97:        DAO.DBEngine.35
# Access 2000/2003: DAO.DBEngine.36
# Access 2007:      DAO.DBEngine.120
daoEngine = win32com.client.Dispatch('DAO.DBEngine.36')

if len(sys.argv) != 3:
    print("Uses Microsoft DAO to copy the database file and compact it.")
    print("Usage: %s DB_FILE FILE_TO_WRITE" % os.path.basename(sys.argv[0]))
    sys.exit(2)

(src_db_path, dest_db_path) = sys.argv[1:]

print('Using database "%s", compacting to "%s"' % (src_db_path, dest_db_path))
daoEngine.CompactDatabase(src_db_path, dest_db_path)
print("Done")
With Python you can compact with the pypyodbc library (either .mdb or .accdb):
import pypyodbc
pypyodbc.win_compact_mdb('C:\\data\\database.accdb','C:\\data\\compacted.accdb')
Then you can copy compacted.accdb back to database.accdb with shutil:
import shutil
shutil.copy2('C:\\data\\compacted.accdb','C:\\data\\database.accdb')
Note: As far as I know, for Access databases over ODBC, Python and its libraries must be 32-bit. Also, these steps probably only work on Windows.

What do Repair and Compact operations do to an .MDB? Will they stop an application crashing?

What do Repair and Compact operations do to an .MDB?
If these operations do not stop a 1GB+ .MDB backed VB application crashing, what other options are there?
Why would a large sized .MDB file cause an application to crash?
"What do compact and repair operations do to an MDB?"
First off, don't worry about repair. The fact that there are still commands that purport to do a standalone repair is a legacy of the old days. The behavior of that command was changed greatly starting with Jet 3.51 and has remained so ever since. That is, a repair will never be performed unless Jet/ACE determines that it is necessary. When you do a compact, it will test whether a repair is needed and perform it before the compact.
So, what does it do?
A compact/repair rewrites the data file, eliminating any unused data pages, writing tables and indexes in contiguous data pages and flagging all saved QueryDefs for re-compilation the next time they are run. It also updates certain metadata for the tables, and other metadata and internal structures in the header of the file.
All databases have some form of "compact" operation because they are optimized for performance. Disk space is cheap, so instead of writing data in a way that uses storage space efficiently, they write to the first available space. Thus, in Jet/ACE, if you update a record, the record is written back to the original data page only if the new data fits within that page. If not, the original data page is marked unused and the record is rewritten to an entirely new data page. Thus, the file can become internally fragmented, with used and unused data pages mixed throughout the file.
A compact organizes everything neatly and gets rid of all the slack space. It also rewrites data tables in primary key order (Jet/ACE clusters on the PK, but that's the only index you can cluster on). Indexes are also rewritten at that point, since over time those become fragmented with use, also.
Compact is an operation that should be part of regular maintenance of any Jet/ACE file, but you shouldn't have to do it often. If you're experiencing regular significant bloat, then it suggests that you may be mis-using your back-end database by storing/deleting temporary data. If your app adds records and deletes them as part of its regular operations, then you have a design problem that's going to make your data file bloat regularly.
To fix that, move the temp tables to a separate standalone MDB/ACCDB so that the churn won't cause your main data file to bloat.
On another note, not applicable in this context: front ends bloat in different ways because of the nature of what's stored in them. Since this question is about an MDB/ACCDB used from VB, I'll not go into details, but suffice it to say that compacting a front end is something that's necessary during development, but only very seldom in production use. The only reason to compact a production front end is to update metadata and recompile the queries stored in it.
It's always been the case that MDB files become slow and prone to corruption as they grow past 1 GB, but I've never known why - it's always been just a fact of life. I did some quick searching and can't find any official, or even well-informed insider, explanation of why this size is correlated with MDB problems, but my experience has always been that MDB files become incredibly unreliable as you approach and exceed 1 GB.
Here's the MS KB article about Repair and Compact, detailing what happens during that operation:
http://support.microsoft.com/kb/209769/EN-US/
The app probably crashes as the result of improper/unexpected data returned from a database query to an MDB that large - what error in particular do you get when your application crashes? Perhaps there's a way to catch the error and deal with it instead of just crashing the application.
If it is crashing a lot then you might want to try a decompile on the DB and/or making a new database and copying all the objects over to the new container.
Try the decompile first. To do that, just add the /decompile switch when launching Access against your database, for example:
"C:\Program Files\Microsoft Office\Office\MSACCESS.EXE" "C:\mydb.mdb" /decompile
Then compact, compile and then compact again
EDIT:
You can't do it without Access being installed, but if the file is just storing data then a decompile will not do you any good. You can, however, look at JetComp to help with your compacting needs:
support.microsoft.com/kb/273956

Why should I care about compacting an MS Access .mdb file?

We distribute an application that uses an MS Access .mdb file. Somebody has noticed that after opening the file in MS Access the file size shrinks a lot. That suggests that the file is a good candidate for compacting, but we don't supply the means for our users to do that.
So, my question is, does it matter? Do we care? What bad things can happen if our users never compact the database?
In addition to making your database smaller, it'll recompute the indexes on your tables and defragment your tables which can make access faster. It'll also find any inconsistencies that should never happen in your database, but might, due to bugs or crashes in Access.
It's not totally without risk though -- a bug in Access 2007 would occasionally delete your database during the process.
So it's generally a good thing to do, but pair it with a good backup routine. With the backup in place, you can also recover from any 'unrecoverable' compact and repair problems with a minimum of data loss.
Make sure you compact and repair the database regularly, especially if the database application experiences frequent record updates, deletions and insertions. Not only will this keep the size of the database file down to the minimum - which will help speed up database operations and network communications - it performs database housekeeping, too, which is of even greater benefit to the stability of your data. But before you compact the database, make sure that you make a backup of the file, just in case something goes wrong with the compaction.
Jet compacts a database to reorganize the content within the file so that each 4 KB "page" (2KB for Access 95/97) of space allotted for data, tables, or indexes is located in a contiguous area. Jet recovers the space from records marked as deleted and rewrites the records in each table in primary key order, like a clustered index. This will make your db's read/write ops faster.
Jet also updates the table statistics during compaction. This includes identifying the number of records in each table, which will allow Jet to use the most optimal method to scan for records, either by using the indexes or by using a full table scan when there are few records. After compaction, run each stored query so that Jet re-optimizes it using these updated table statistics, which can improve query performance.
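A rough sketch of that "run each stored query" step, assuming all you want is to force the saved plans to recompile (hidden, action and parameter queries are simply skipped):

Public Sub RecompileSavedQueries()
    Dim db As DAO.Database, qdf As DAO.QueryDef, rs As DAO.Recordset
    Set db = CurrentDb
    For Each qdf In db.QueryDefs
        If Left$(qdf.Name, 1) <> "~" And qdf.Type = dbQSelect Then   ' skip hidden/temp and action queries
            On Error Resume Next       ' parameter queries fail here; just move on
            Set rs = qdf.OpenRecordset(dbOpenSnapshot)
            If Err.Number = 0 Then rs.Close
            Err.Clear
            On Error GoTo 0
        End If
    Next qdf
End Sub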
Access 2000, 2002, 2003 and 2007 combine the compaction with a repair operation if it's needed. The repair process:
1 - Cleans up incomplete transactions
2 - Compares data in system tables with data in actual tables, queries and indexes and repairs the mistakes
3 - Repairs very simple data structure mistakes, such as lost pointers to multi-page records (which isn't always successful and is why "repair" doesn't always work to save a corrupted Access database)
4 - Replaces missing information about a VBA project's structure
5 - Replaces missing information needed to open a form, report and module
6 - Repairs simple object structure mistakes in forms, reports, and modules
The bad things that can happen if the users never compact/repair the db is that it will become slow due to bloat, and it may become unstable - meaning corrupted.
Compacting an Access database (also known as a MS JET database) is a bit like defragmenting a hard drive. Access (or, more accurately, the MS JET database engine) isn't very good with re-using space - so when a record is updated, inserted, or deleted, the space is not always reclaimed - instead, new space is added to the end of the database file and used instead.
A general rule of thumb is that if your [Access] database will be written to (updated, changed, or added to), you should allow for compacting - otherwise it will grow in size (much more than just the data you've added, too).
So, to answer your question(s):
Yes, it does matter (unless your database is read-only).
You should care (unless you don't care about your user's disk space).
If you don't compact an Access database, over time it will grow much, much, much larger than the data inside it would suggest, slowing down performance and increasing the possibilities of errors and corruption. (As a file-based database, Access database files are notorious for corruption, especially when accessed over a network.)
This article on How to Compact Microsoft Access Database Through ADO will give you a good starting point if you decide to add this functionality to your app.
I would offer the users a method for compacting the database. I've seen databases grow to 600+ megabytes when compacting will reduce to 60-80.
To echo Nate:
In older versions, I've had it corrupt databases - so a good backup regime is essential. I wouldn't code anything into your app to do that automatically. However, if a customer finds that their database is running really slow, your tech support people could talk them through it if need be (with appropriate backups of course).
If their database is getting to be so large that compaction starts to become a necessity, though, maybe it's time to move to MS-SQL.
I've found that Access database files almost always get corrupted over time. Compacting and repairing them helps hold that off for a while.
Well, it really matters! MDB files keep increasing in size each time you manipulate their data, until they reach an unbearable size. But you don't have to supply a compacting method through your interface. You can add the following code to your MDB file to have it compacted each time the file is closed:
Application.SetOption "Auto Compact", True
I would also highly recommend looking into VistaDB (http://www.vistadb.net/) or SQL Compact (http://www.microsoft.com/sql/editions/compact/) for your application. These might not be the right fit for your app... but are definitely worth a look.
If you don't offer your users a way to compact the database and the raw size isn't an issue to begin with, then don't bother.