Inspect large linked database in Access

This sounds like quite a simple and straightforward question, but I searched online and could not find an answer to my problem. I would like to view the linked database from Access, but the database is too large and every step takes forever to load the data. I wonder if there is a better way to inspect the data tables? Sorry if this has been asked somewhere else; I am a bit new to Access.

Well, you have the program part (often called the front end, or FE).
Then you have linked tables to the data file (often called the back end, or BE).
So I can't say there's necessarily going to be much difference from just looking at the list of linked tables in the nav pane (FE).
Or, you can fire up Access and open the BE file. At that point, you will again see the "list" of tables in the nav pane. About the only difference here is that, as a general rule, you can't make changes to the table structure(s) from the FE.
But other than that, the performance should not be much different. Of course, if you are on a network and the BE is in some folder? Well, then your network connection of course can and will affect performance.
So, in that case, what one often does is simply copy the BE from the server folder to a local folder. You can then open + use + play + consume that database (BE) 100% locally on your computer, without a network between you and the data file. This will of course run MUCH faster, and thus let you see and play with the tables, and open them to see the data inside.
So, all in all? Copy the BE to a local folder. You'll be working on a copy of the data (that's safe - you can't mess up production data), but performance-wise you'll find that any performance considerations should be largely eliminated.
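A minimal VBA sketch of that copy step (both paths here are hypothetical, and the BE must be closed first, since FileCopy fails on an open/locked file):

    ' Pull a private copy of the BE down from the server share.
    ' Both paths are hypothetical - substitute your own locations.
    Public Sub CopyBackEndLocal()
        ' FileCopy fails if the BE is open (locked), so close it first.
        FileCopy "\\Server\Share\ProjectX_backend.mdb", "C:\Dev\ProjectX_backend.mdb"
    End Sub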
And for development and testing? Often we take the BE and place it on our local computer (say, a laptop) and thus work with that BE locally. And depending on how the FE (program/software part) is set up, it will often have some option to re-link, and thus you can point the FE to a different BE, as per the sketch below.
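A minimal sketch of such a re-link routine, assuming DAO and that all links point to an Access file BE (not ODBC) whose table names match the existing links:

    ' Re-point every linked table in the FE to a new back-end file.
    Public Sub RelinkBackEnd(strNewBE As String)
        Dim db As DAO.Database
        Dim tdf As DAO.TableDef
        Set db = CurrentDb
        For Each tdf In db.TableDefs
            If Len(tdf.Connect) > 0 Then        ' only linked tables have a connect string
                tdf.Connect = ";DATABASE=" & strNewBE
                tdf.RefreshLink                 ' re-establish the link
            End If
        Next tdf
    End Sub

You could then call, say, RelinkBackEnd "C:\Dev\ProjectX_backend.mdb" (a hypothetical path) from the Immediate window.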
Just keep in mind that if you make changes to the BE? And you want such changes from that copy to appear or be made on the production BE? Well, you have to make notes, since there's not really an automated way to send changes (say, new tables, or changes to table designs) to the production BE. And of course, one has to be VERY careful. You can make changes to the tables such as renaming them, or changing field names - that will for sure break the FE program part. You can in most cases of course add new fields/columns to existing tables, and that in most cases should not break your software.
But from a performance point of view? I am somewhat perplexed that you note performance issues and problems. Perhaps there is some VPN between the FE and BE (and that does not work well at all - you in general require a good solid network connection - a LAN, not a VPN/WAN - between the FE and BE). If a VPN (WAN) is to be adopted, then in most cases the BE needs to be migrated to SQL Server - the FE (program) part can then use linked tables to SQL Server, and not a file-based BE.
So while the above should make sense, the performance issue you are dealing with, or that you note here, is somewhat perplexing (it does not quite make a whole lot of sense).

Related

What is the best way to prevent Access database bloat

Intro:
I am creating an Access database system that will be rolled out with multi-user functionality.
But as I am creating this database in Access 2000 (old school, I know), there are quite a lot of bugs and random mysterious problems that occur when my database gets past 40-60MB.
My question:
Has anyone got a good solution to how I can shrink this down or to prevent the bloat?
Details:
I am using many local tables combined with SQL tables, and my front end links to a back-end SQL Server.
I have already tried compact and repair, but it only ever shrinks it to about 15MB, and after the user has used the database a few times the bloat expands quickly to over 50-60MB!
Let me know if more detail is needed but that is the rough outline of my problem.
Many Thanks!
Here are some ideas for you to follow.
You said you also have a lot of local tables. Split the local tables off into yet another Access database. So you'll have 2 back-ends (1 SQL Server & 1 Access), and the front end.
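A rough VBA sketch of that split, with hypothetical names, and assuming the new back-end file already exists:

    ' Push a local table out to a second Access back-end, then replace the
    ' local copy with a link, so the FE keeps only code and UI objects.
    Public Sub MoveTableToLocalBE(strTable As String)
        Const strBE As String = "C:\ProjectX_local_backend.mdb"   ' hypothetical path
        ' Copy the table (structure + data) into the new back-end...
        DoCmd.TransferDatabase acExport, "Microsoft Access", strBE, acTable, strTable, strTable
        ' ...then drop the local copy and replace it with a link.
        DoCmd.DeleteObject acTable, strTable
        DoCmd.TransferDatabase acLink, "Microsoft Access", strBE, acTable, strTable, strTable
    End Sub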
Create a batch file that opens your local tables backend database with the /compact option. So, it will look something like this:
"C:\Prog...\Microsoft...\Officexx\ C:\ProjectX_backend.mdb /compact"
Then run this batch file on a daily basis using scheduled tasks. Your frontend should never need compacting unless you edit it in any way.
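If you'd rather stay inside Access, DAO's CompactDatabase method can do the same job. A rough sketch, run from a separate "maintenance" database (a file cannot compact itself while open), with hypothetical paths:

    ' Compact the back-end into a fresh copy, then swap it into place.
    Public Sub CompactBackEnd()
        Dim strSrc As String, strTmp As String
        strSrc = "C:\ProjectX_backend.mdb"            ' hypothetical paths
        strTmp = "C:\ProjectX_backend_compact.mdb"
        If Len(Dir$(strTmp)) > 0 Then Kill strTmp     ' clear any leftover copy
        DBEngine.CompactDatabase strSrc, strTmp       ' writes a compacted copy
        Kill strSrc                                   ' swap the copy into place
        Name strTmp As strSrc
    End Sub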
If you are stuck with 2000, which has quite a bad reputation, then you have to dig down into your application and find out what creates the bloat. The most common reason is bulk inserts followed by deletes. Other reasons are the use of OLE Object fields, and programmatic changes to form and other objects. You really have to go through your application and find the specific cause.
An mdb file that is only connected to a back-end server and does not make changes to local objects should not grow.
As for your random issues, besides some lack of stability in the 2000 version, you should look into bad RAM in the computers, bad hard drives, and broken network controllers if your mdb file is shared on the network.

Storing image in database vs file system (is this a valid use case?)

I have an application where every user gets their own database but runs from the same file system folder (the database is determined by subdomain).
Storing in the filesystem could lead to conflicts. I'd imagine the uploaded images would be small (I would scale them down before storing).
Is it OK in this case to store them in the database?
(I know this has been asked a lot.)
(I also want to make my application easy to install, and creating a writable folder is hard for some people.)
To take the contrary view from Nathanial -- I find it easier to use the database to store opaque data like images. When you back up the database, you automatically get a backup of the images. Also, you can retrieve, update, or delete the image along with all the other data in integrated SQL queries; keeping the files separately means writing much more complex code that has to go out to the file system to maintain data integrity every time you issue certain SQL queries. Locking can be a big problem, and transaction processing (especially rollback) even bigger.
Seems like you've already sort of talked yourself into it, but in my experience it's better to store files in a filesystem and data in a database. Use GUIDs for the file names if you are worried about a conflict.
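The idea is language-agnostic; here is a small VBA sketch of it (to match the rest of this page), using the Scriptlet.TypeLib trick to obtain a GUID. The destination folder and variable names are hypothetical:

    ' Build a collision-proof file name for an uploaded image.
    Public Function GuidFileName(strExt As String) As String
        Dim strGuid As String
        strGuid = CreateObject("Scriptlet.TypeLib").Guid   ' "{xxxxxxxx-...}" plus trailing nulls
        strGuid = Mid$(strGuid, 2, 36)                     ' keep just the 36 GUID characters
        GuidFileName = strGuid & "." & strExt
    End Function

    ' Usage (hypothetical paths):
    ' FileCopy strUploadTemp, "C:\Uploads\" & GuidFileName("jpg")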
Pasting my answer from a similar post: I have implemented both solutions (file system and database-persisted images) in previous projects. In my opinion, you should store images in your database. Here's why:
File system storage is more complicated when your app servers are clustered. You have to have shared storage. Even if your current environment is not clustered, this makes it more difficult to scale up when you need to.
You should be using a CDN for your static content anyway, and set your app up as the origin. This means that your app will only be hit once for a given image, then it will be cached on the CDN. CloudFront is dirt cheap and simple to set up... there's no reason not to use it. Save your bandwidth for your dynamic content.
It's much quicker (and thus cheaper) to develop database-persisted images.
You get referential integrity with database-persisted images. If you're storing images on the file system, you will inevitably have orphan files with no matching database records, or you'll have database records with broken file links. This WILL happen... it's just a matter of time. You'll have to write something to clean these up.
Anyways, my two cents.

Collaborating on websites with relational databases and a CMS

What processes do you put in place when collaborating in a small team on websites with databases?
We have no problems working on site files as they are under revision control, so any number of our developers can work from any location on this aspect of a website.
But, when database changes need to be made (either directly as part of the development or implicitly by making content changes in a CMS), obviously it is difficult for the different developers to then merge these database changes.
Our approaches thus far have been limited to the following:
Putting a content freeze on the production website and having all developers work on the same copy of the production database
Delegating tasks that will involve database changes to one developer and then asking other developers to import a copy of that database once changes have been made; in the meantime other developers work only on site files under revision control
Allowing developers to make changes to their own copy of the database for the sake of their own development, but then manually making these changes on all other copies of the database (e.g. providing other developers with an SQL import script pertaining to the database changes they have made)
I'd be interested to know if you have any better suggestions.
We work mainly with MySQL databases and at present do not keep track of revisions to these databases. The problems discussed above pertain mainly to Drupal and Wordpress sites where a good deal of the 'development' is carried out in conjunction with changes made to the database in the CMS.
You put all your database changes in SQL scripts. Put some kind of sequence number into the filename of each script so you know the order they must be run in. Then check those scripts into your source control system. Now you have reproducible steps that you can apply to test and production databases.
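The question is about MySQL/CMS sites, but to keep this page in one language, here is the same idea transposed to an Access/VBA sketch: each numbered step lives in source control and runs exactly once, in order, so every copy of the database converges on the same schema. The table, field, and index names are hypothetical, and a one-row tblVersion table is assumed.

    ' A version-stamped upgrade routine checked in alongside the FE.
    Public Sub UpgradeSchema()
        Dim db As DAO.Database
        Dim lngVer As Long
        Set db = CurrentDb
        lngVer = DLookup("SchemaVersion", "tblVersion")   ' assumes a one-row version table
        If lngVer < 1 Then   ' step 1: add an email column
            db.Execute "ALTER TABLE tblCustomer ADD COLUMN Email TEXT(255)", dbFailOnError
            db.Execute "UPDATE tblVersion SET SchemaVersion = 1", dbFailOnError
        End If
        If lngVer < 2 Then   ' step 2: index it
            db.Execute "CREATE INDEX idxEmail ON tblCustomer (Email)", dbFailOnError
            db.Execute "UPDATE tblVersion SET SchemaVersion = 2", dbFailOnError
        End If
    End Sub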
While you could put all your DDL into version control, this can get very messy very quickly if you try to manage lots and lots of ALTER statements.
Forcing all developers to use the same source database is not a very efficient approach either.
The solution I used was to maintain a file for each database entity specifying how to create the entity (primarily so the changes could be viewed using a diff utility), then manually creating ALTER statements by comparing the release version with the current version - yes, it is rather labour intensive but the only way I've found to solve the problem.
I had a plan to automate the generation of the ALTER statements - it should be relatively straightforward; indeed, a quick google found this article and this one. I never got round to implementing one myself, since the effort of doing so was much greater than the frequency of schema changes on the projects I was working on.
Where I work, every developer (actually, every development virtual machine) has their own database (or rather, their own schema on a shared Oracle instance). Our working process is based around complete rebuilds. We don't have any ability to modify an existing database - we only ever have the nuclear option of blowing away the whole schema and building from scratch.
We have a little 'drop everything' script, which uses queries on system tables to identify every object in the schema, constructs a pile of SQL to drop them, and runs it. Then we have a stack of DDL files full of CREATE TABLE statements, then we have a stack of XML files containing the initial data for the system, which are loaded by a loading tool. All of this is checked into source control. When a developer does an update from source control, if they see incoming database changes (DDL or data), they run the master build script, which runs them in order to create a fresh database from scratch.
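Transposing the "drop everything" idea into Access/VBA terms (the original is Oracle; this is just a sketch of the same nuclear option, and note it would also drop table links):

    ' Enumerate every non-system table and drop it, leaving a clean slate
    ' for the CREATE TABLE scripts to rebuild.
    Public Sub DropEverything()
        Dim db As DAO.Database
        Dim i As Integer
        Set db = CurrentDb
        For i = db.TableDefs.Count - 1 To 0 Step -1   ' backwards: we're deleting
            If (db.TableDefs(i).Attributes And dbSystemObject) = 0 Then
                db.Execute "DROP TABLE [" & db.TableDefs(i).Name & "]", dbFailOnError
            End If
        Next i
    End Sub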
The good thing is that this makes life simple. We never need to worry about diffs, deltas, ALTER TABLE, reversibility, etc, just straightforward DDL and data. We never have to worry about preserving the state of the database, or keeping it clean - you can get back to a clean state at the push of a button. Another important feature of this is that it makes it trivial to set up a new platform - and that means that when we add more development machines, or need to build an acceptance system or whatever, it's easy. I've seen projects fail because they couldn't build new instances from their muddled databases.
The main bad thing is that it takes some time - in our case, due to the particularly depressing details of our system, a painfully long time, but I think a team that was really on top of its tools could do a complete rebuild like this in 10 minutes. Half an hour if you have a lot of data. Short enough to be able to do it a few times during a working day without killing yourself.
The problem is what you do about data. There are two sides to this: data generated during development, and live data.
Data generated during development is actually pretty easy. People who don't work our way are presumably in the habit of creating that data directly in the database, and so see a problem in that it will be lost when rebuilding. The solution is simple: you don't create the data in the database, you create it in the loader scripts (XML in our case, but you could use SQL DML, or CSV with your database's import tool, or whatever). Think of the loader scripts as being source code, and the database as object code: the scripts are the definitive form, and are what you edit by hand; the database is what's made from them.
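Here is the same idea in a minimal Access/VBA transposition (the original shop uses XML loaders against Oracle; the names here are hypothetical): the seed data lives in a script under source control, and a rebuild recreates it wholesale.

    ' Reload reference data from script rather than editing it in the database.
    Public Sub LoadSeedData()
        With CurrentDb
            .Execute "DELETE FROM tblStatus", dbFailOnError
            .Execute "INSERT INTO tblStatus (StatusID, StatusName) VALUES (1, 'Open')", dbFailOnError
            .Execute "INSERT INTO tblStatus (StatusID, StatusName) VALUES (2, 'Closed')", dbFailOnError
        End With
    End Sub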
Live data is tougher. My company hasn't developed a single process which works in all cases - I don't know if we just haven't found the magic bullet yet, or if there isn't one. One of our projects is taking the approach that live is different to development, and that there are no complete rebuilds; rather, they have developed a set of practices for identifying the deltas when making a new release and applying them manually. They release every few weeks, so it's only a couple of days' work for a couple of people that often. Not a lot.
The project I'm on hasn't gone live yet, but it is replacing an existing live system, so we have a similar problem. Our approach is based on migration: rather than trying to use the existing database, we are migrating all the data from it into our system. We have written a rather sprawling tool to do this, which runs queries against the existing database (a copy of it, not the live version!), then writes the data out as loader scripts. These then feed into the build process just like any others. The migration is scripted, and runs every night as part of our daily build. In this case, the effort needed to write this tool was necessary anyway, because our database is very different in structure to the old one; the ability to do repeatable migrations at the push of a button came for free.
When we go live, one of our options will be to adapt this process to migrate from old versions of our database to new ones. We'll have to write completely new queries, but they should be very easy, because the source database is our own, and the mapping from it to the loader scripts is, as you would imagine, straightforward, even as the new version of the system drifts away from the live version. This would let us keep working in the complete rebuild paradigm - we still wouldn't have to worry about ALTER TABLE or keeping our databases clean, even when we're doing maintenance. I have no idea what the operations team will think of this idea, though!
You can use the replication module of the database engine, if it has one. One server will be the master; changes are to be made on it. Developers' copies will be slaves, and any changes on the master will be duplicated on the slaves. It's one-way replication, and it can be a bit tricky to put into place, as any changes on the slaves will be erased. It also means that each developer needs two copies of the database: one will be the slave, and another the "development" database.
There are also tools for cross-database replication, so any copy can be the master.
Both solutions can lead to disasters (replication errors).
The only solution I see fit is to have only one database for all developers and save it several times a day on a rotating history. That won't save you from conflicts, but you will be able to restore the previous version if one happens (and it always does...).
Where I work we are using DotNetNuke, and this poses the same problems: once released, the production site has data going into the database, as well as files being added to the file system by some modules and in the DNN file system.
We are versioning the site file system with svn which for the most part works ok. However, the database is a different matter. The best method we have come across so far is to use RedGate tools to synchronise the staging database with the production database. RedGate tools are very good and well worth the money.
Basically we all develop locally with a local copy of the database and site. If the changes are major we branch. Then we commit locally and do a RedGate merge to put our DB changes on the shared dev server.
We use a shared dev server so others can do the testing. Once complete we then update the site on staging with svn and then merge the database changes from the development server to the staging server.
Then to go live we do the same from staging to prod.
This method works but is prone to error and is very time consuming when small changes need to be made. The prod DB is always backed up so we can roll back easily if a delivery goes wrong.
One major headache we have is that DotNetNuke uses identity columns in many tables, and if you have data going into tables on development and production (such as tabs, permissions, and module instances), you have a nightmare syncing them. Ideally you want to find or build a CMS that uses GUIDs or something else in the database so you can easily sync tables that are in use concurrently.
We'd love to find a better method, as we have a lot of trouble with branching and merging when projects are concurrent.
Gus

Growing Access Frontend: Should I be concerned?

I've read opinions across the internet that say if you design your MS Access front end properly, it shouldn't shrink too much when you do a compact. I've got one front end I'm using that is typically around 15 MB when compacted, but grows to 20-25 MB while I'm working on it! Is this something I should be concerned about?
There is a distinction between development and production use.
during development, bloat should be expected -- you're churning the data pages in your front end, revising forms, reports, modules, etc., so there will be frequent discarding of data pages. There is nothing wrong with this. During development, you should compact regularly, and occasionally decompile (not often -- I tend to do it maybe once a day during heavy development, and/or immediately before distributing a new front end into production use).
during production use, a properly-designed front end should not bloat much. Yes, when you supply a compiled and compacted front end, it will grow some during use, but after a while, that growth should top off. But you shouldn't be concerned about that, as front ends are fungible. If something goes wrong with one, you just replace it with a new one.
The most common reason people encounter bloat in front ends is because they design them incorrectly, including temporary data in their front end (e.g., a table that has data appended to it that is then deleted). Temp data belongs in a temp file. All of my apps have a tmp.mdb that is distributed along with the front end and stored in the same folder as the front end, and all temporary data is stored there. I generally never bother to compact temp files.
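A minimal sketch of that tmp.mdb pattern, with hypothetical table/query names: the temp file sits next to the FE and is simply recreated, so any bloat lands there instead of in the front end.

    ' Rebuild the temp database from scratch and fill a work table in it.
    Public Sub ResetTempDb()
        Dim strTmp As String
        strTmp = CurrentProject.Path & "\tmp.mdb"
        If Len(Dir$(strTmp)) > 0 Then Kill strTmp       ' throw away yesterday's bloat
        DBEngine.CreateDatabase strTmp, dbLangGeneral   ' fresh, zero-bloat temp file
        CurrentDb.Execute "SELECT * INTO tblWork IN '" & strTmp & "' FROM qryWorkSource", dbFailOnError
    End Sub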
Other sources of bloat might include:
design changes to forms/reports made in code (which would be the same in terms of bloat as the human developer making the same changes). This is almost always a design error, in my opinion.
changes to saved QueryDefs in the app. This one is less significant, as the amount of bloat is quite small compared to other types of bloat. However, if this is being done thousands of times in a session, it could theoretically reach the level of significance. There are a few good reasons to edit saved QueryDefs at runtime, but not very many, so while I wouldn't call doing this a design error, it would be a red flag that needs to be checked, to make sure it's not something that can be accomplished efficiently without editing the saved QueryDef (see the sketch below).
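A short sketch of the usual bloat-free alternative, assuming a hypothetical saved query qryReportParam declared with PARAMETERS [StartDate] DateTime; nothing is written to the FE because the stored definition is left alone:

    ' Pass a parameter instead of rewriting the saved QueryDef's SQL.
    Public Sub OpenReportData()
        Dim qdf As DAO.QueryDef
        Dim rs As DAO.Recordset
        Set qdf = CurrentDb.QueryDefs("qryReportParam")
        qdf.Parameters("StartDate") = #1/1/2024#    ' no write to the FE
        Set rs = qdf.OpenRecordset(dbOpenSnapshot)
        If Not rs.EOF Then rs.MoveLast              ' populate before counting
        Debug.Print rs.RecordCount
    End Sub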
As you are adding reports and so on, I do not think you should be concerned. I suggest that you decompile* fairly regularly when you are working on code, forms and reports.
* http://wiki.lessthandot.com/index.php/Decompile
Growing front end? Too stupid to be true, but it works. My database is used (through the cloud) by several companies, and therefore the application can hardly ever be closed for compacting (the last one to leave puts the lights out: compacts the database). My customers need to be online in the database all of the time. In less than one week, the front end used to grow from 16 MB to over 2 GB! This was scaring the hell out of me.
Solution: in File Explorer, simply right-click the front-end database, click 'Properties', and check the 'Read-only' box.
Access will try to write the enlarged front end, but does not crash on the read-only flag. Again: just too simple to be true!
Best regards, Jaap Schokker, miniPLEX B.V., Wageningen, Holland

Can splitting .MDB files into segments help with stability?

Is this a realistic solution to the problems associated with larger .mdb files:
split the large .mdb file into smaller .mdb files
have one 'central' .mdb containing links to the tables in the smaller .mdb files
How easy would it be to make this change to an .mdb backed VB application?
Could the changes to the database be done so that there are no changes required to the front-end application?
Edit Start
The short answer is "No, it won't solve the problems of a large database."
You might be able to overcome the DB size limitation (~2GB) by using this trick, but I've never tested it.
Typically, with large MS Access databases, you run into problems with speed and data corruption.
Speed
Is it going to help with speed? You still have the same amount of data to query and search through, and the same algorithm. So all you are doing is adding the overhead of having to open up multiple files per query. So I would expect it to be slower.
You might be able to speed it up by reducing the time that it takes to get the information off the disk. You can do this in a few ways:
faster drives
put the MDB on a RAID (anecdotally, RAID 1+0 may be faster)
split the MDB up (as you suggest) into multiple MDBs, and put them on separate drives (maybe even separate controllers).
(How well this would work in practice vs. theory, I can't tell you - if I was doing that much work, I'd still choose to switch DB engines.)
Data Corruption
MS Access has a well-deserved reputation for data corruption. To be fair, I haven't had it happen to me for some time. This may be because I've learned not to use it for anything big; or it may be because MS has put a lot of work into trying to solve these problems; or, more likely, a combination of both.
The prime culprits in data corruption are:
Hardware: e.g., cosmic rays, electrical interference, iffy drives, iffy memory, and iffy CPUs - I suspect MS Access does not have error handling/correction as good as other databases do.
Networks: lots of collisions on a saturated network can confuse MS Access and convince it to scramble important records; as can sub-optimally implemented network protocols. TCP/IP is good, but it's not invincible.
Software: as I said, MS has done a lot of work on MS Access over the years; if you are not up to date on your patches (MS Office and OS), get up to date. Problems typically happen when you hit extremes like the 2 GB limit (some bugs are hard to test and won't manifest themselves except at the edge cases, which makes them less likely to have been seen or corrected, unless reported by a motivated user to MS).
All this is exacerbated with larger databases, because larger databases typically have more users and more workstations accessing them. Altogether, the larger database and the number of users multiply to provide more opportunity for corruption to happen.
Edit End
Your best bet would be to switch to something like MS SQL Server. You could start by migrating your data over, and then linking one MDB to it. You get the stability of SQL Server, and most (if not all) of your code should still work.
Once you've done that, you can then start migrating your VB app(s) over to use SQL Server instead.
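A minimal DAO sketch of that linking step, with a hypothetical server, database, and DSN-less driver setup:

    ' Link the FE to a table that has been migrated to SQL Server.
    Public Sub LinkSqlTable(strTable As String)
        Dim db As DAO.Database
        Dim tdf As DAO.TableDef
        Set db = CurrentDb
        Set tdf = db.CreateTableDef(strTable)
        tdf.Connect = "ODBC;DRIVER={ODBC Driver 17 for SQL Server};" & _
                      "SERVER=MyServer;DATABASE=MyDb;Trusted_Connection=Yes;"
        tdf.SourceTableName = "dbo." & strTable
        db.TableDefs.Append tdf
    End Sub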
If you have more data than fits in a single MDB then you should get a different database engine.
One main issue that you should consider is that you can't enforce referential integrity between tables stored in different MDBs. That should be a show-stopper for any actual database.
If it's not, then you probably don't have a proper schema designed in the first place.
For reasons more adequately explained by CodeSlave the answer is No and you should switch to a proper relational database.
I'd like to add that this does not have to be SQL Server. Quite possibly the reason why you are reluctant to do this is one of cost, SQL Server being quite expensive to obtain and deploy if you are not in an educational or charitable organisation (when it's remarkably cheap and then usually a complete no-brainer).
I've recently had extremely good results moving an Access system from MDB to MySQL. At least 95% of the code functioned without modification, and of the remaining 5% most was straightforward, with only a few limited areas where significant effort was required. If you have sloppy code (not closing connections or releasing objects) then you'll need to fix that, but generally I was pleasantly surprised by how painless this approach was. Certainly I would highly recommend that if the reason you are reluctant to move to a database backend is one of cost, then you should not attempt to manipulate .mdb files and go instead for the more robust database solution.
Hmm, well, if the data is going through this central DB then there is still going to be a bottleneck in there. The only reason I can think why you would do this is to get around the size limit of an Access mdb file.
Having said that, if the business functions can be split off into separate applications, then that might be a good option, with a central DB containing all the linked tables for reporting purposes. I have used this before to good effect.