How to avoid data redundancy when copying between different DBMS? - mysql

I'm planning to create a VB.NET application that retrieves data from a database (MS Access) and stores it on a web server (a MySQL database). I'm really unsure how to approach this. I'm planning to use Task Scheduler so that the program runs automatically, and to set the interval to every 5 minutes.
How can I avoid the redundancy of data?
For example, I'm planning to pull the sales for the last 5 minutes, and after 5 minutes I will do it again. I think there will be duplicated data in that case. I would like to ask your ideas about this scenario: how would you handle it?

If at all possible you should avoid using two databases in a situation like this.
Look for information on the linked table manager -- the data that Access uses doesn't have to be stored in Access.
http://www.mssqltips.com/sqlservertip/1480/configure-microsoft-access-linked-tables-with-a-sql-server-database/
If you have to do this, then see about using/upgrading to Access 2010 and use data macros (triggers) to put the new/changed data into temp tables that you clear out once you've copied the data over.
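As a rough sketch of that temp/staging-table idea (the table and column names here are invented, and it assumes the MySQL destination is reachable from Access as an ODBC linked table called mysql_sales):

-- Staging table that the data macro (trigger) fills with new/changed rows
CREATE TABLE sales_staging (
    sale_id   INT NOT NULL PRIMARY KEY,
    sale_date DATETIME NOT NULL,
    amount    DECIMAL(10,2) NOT NULL
);

-- Scheduled transfer job: push the staged rows to MySQL, then clear the staging table
INSERT INTO mysql_sales (sale_id, sale_date, amount)
SELECT sale_id, sale_date, amount FROM sales_staging;

DELETE FROM sales_staging;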

In a comment you said "i dont have any idea about how to replace the native tables with ODBC".
Is that the only obstacle which prevents you consolidating the data into one set in MySQL? If so, try this suggestion for setting ODBC links to MySQL tables.
Install an ODBC driver for MySQL, if you don't have one already. The latest version is available here: Download Connector/ODBC
Create a DSN (Data Source Name) for your MySQL database from the Windows ODBC Data Source Administrator.
Create a new Access database and use the DSN to create links with guidance from the web page link @jmoreno provided.
If the Access names of the linked tables are different than the names you originally used for the native Access tables, change them to match those original names.
Then you can import your forms, queries, reports, etc. from the old Access application. Ideally everything will just work, since Access will find the table names it needs and won't care that they are external rather than native tables. However, you may need to resolve any data type incompatibilities between Access and MySQL.
You would need the MySQL ODBC driver on each machine where the Access application is used. Personally I would prefer to deal with that rather than the challenges of synchronizing between separate Access and MySQL data stores. (YMMV)
When you're ready to deploy, you can convert the ODBC links to DSN-less connections so the client machines wouldn't need to each have the DSN configured. See Using DSN-Less Connections by Doug Steele, Access MVP, for detailed instructions.

You will need to think very carefully about how you identify the data which has changed since the last synchronization cycle. If every row of data has a 'last updated' timestamp (that is indexed), then you could write a process that selects the recently updated rows from each table in turn. That's apt to be a bit heavy on the originating database (MS Access), plus you still have to identify the corresponding row to replace (where replacement is required) in the MySQL database. Of course, you can put different tables on different change schedules. For example, the table of US states probably doesn't change even once a year, but your customer orders tables (or SO questions and answers tables) may change a lot in five minutes.
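As a minimal sketch of the timestamp approach (the sales table, last_updated column, and cutoff value are assumptions, not from the question):

-- An indexed 'last updated' column on each table being synchronized
ALTER TABLE sales ADD COLUMN last_updated DATETIME;
CREATE INDEX ix_sales_last_updated ON sales (last_updated);

-- Each 5-minute cycle, pull only the rows touched since the previous run
SELECT *
FROM sales
WHERE last_updated > '2014-01-01 12:00:00';  -- the timestamp recorded at the end of the last cycle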
Some DBMS have alternative mechanisms, especially for working between copies of themselves. Some DBMS also provide a mechanism called 'change data capture' (CDC) that allows you to get the changed data. Sometimes, in DBMS where you have a 'transaction log' or 'logical log' (but not CDC or something similar), you can 'mine' the log files (or log backups) to find the changes. However, the logs are typically optimized for the DBMS's internal recovery processes, not for your use.

Well, obviously you will have to keep track of the data items you have already processed (perhaps in a separate metadata space/datastore) to avoid the redundancy. That metadata should be used to filter out records that have already been processed from the source. The logic, and what needs to be in the metadata, depends on the exact use case.
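For instance, a minimal sketch of such a metadata store (the sync_state table and the sales/sale_id names are hypothetical): record the highest key already copied, and filter the next extract against it.

CREATE TABLE sync_state (
    table_name     VARCHAR(64) NOT NULL PRIMARY KEY,
    last_copied_id INT NOT NULL
);

-- Next cycle: copy only rows beyond the recorded high-water mark
SELECT s.*
FROM sales AS s
WHERE s.sale_id > (SELECT last_copied_id FROM sync_state WHERE table_name = 'sales');

-- After a successful copy, advance the mark
UPDATE sync_state
SET last_copied_id = (SELECT MAX(sale_id) FROM sales)
WHERE table_name = 'sales';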

Related

Best database model for saas application (1 db per account VS 1 db for everyone)

Quick question: I'm developing SaaS software (an ERP).
I designed it with 1 database per account for these reasons:
I do a lot of personalisation, and need to add specific table columns for each account.
Easier to manage db backups (and reload data!)
Less risky: sometimes I need to run SQL queries on a table, and in case of an error with a bad query (update/delete...), only one customer is affected instead of all of them.
Bad point: I'm ending up with hundreds of databases...
I'm hiring a company to manage my servers, and they said that it's better to have only one database with a few tables, putting all the data in the same tables with a column such as id_account. I'm very surprised by this, so I'm wondering... what are your ideas?
Thanks!
Frederic
In the environment I am currently working in, we handle millions of records from numerous clients. Our solution is to use schemas to segregate each individual client. A schema allows you to partition your clients into separate virtual databases inside a single db. Each schema has an exact copy of the tables from your application.
The upside:
Segregated client data
data from a single client can be easily backed up, exported or deleted
Programming is still the same, but you have to select the schema before db calls
Moving clients to another db or standalone server is a lot easier
adding specific tables per client is easier (see below)
single instance of the database running
tuning the db affects all tenants
The downside:
Unless you manage your shared schema properly, you may duplicate data
Migrations are repeated for every schema
You have to remember to select the schema before db calls
hard pressed to add many negatives... I guess I may be biased.
Adding specific tables: why would you add client-specific tables if this is SaaS and not custom software? Better to use a Postgres DB with an hstore field and store as much searchable data as you like.
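A quick PostgreSQL sketch of that hstore idea (the table, column, and key names are made up for illustration):

CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE client_profile (
    id    SERIAL PRIMARY KEY,
    attrs hstore               -- arbitrary per-client key/value data, no schema change needed
);

INSERT INTO client_profile (attrs)
VALUES ('warehouse_code => "A7", loyalty_tier => "gold"');

-- Query a dynamic attribute without ever adding a column
SELECT * FROM client_profile WHERE attrs -> 'loyalty_tier' = 'gold';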
Schemas are ideal for multi-tenant databases.
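To illustrate the schema-per-tenant pattern in PostgreSQL terms (the schema and table names are invented): each client gets an identical set of tables in its own schema, and "selecting the schema before db calls" is just a matter of setting the search path.

-- One schema per client, each with the same tables
CREATE SCHEMA client_acme;
CREATE TABLE client_acme.invoices (
    id    SERIAL PRIMARY KEY,
    total NUMERIC(10,2) NOT NULL
);

CREATE SCHEMA client_globex;
CREATE TABLE client_globex.invoices (
    id    SERIAL PRIMARY KEY,
    total NUMERIC(10,2) NOT NULL
);

-- Select the schema before issuing queries for a given tenant
SET search_path TO client_acme;
SELECT * FROM invoices;   -- resolves to client_acme.invoices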
A lot of what I am telling you depends on your software stack, the capabilities of your developers, and the backend db you selected (all of which you neglected to mention).
Your hardware guys should not decide your software architecture. If they do, you are likely shooting yourself in the foot before you even get out of the gate. Get a good senior software architect; the grief they will save you will likely save your business.
I hope this helps...
Bonne Chance

Oracle 11g for a MySQLer, concept of database

I come from a strong MySQL background, and I am now starting with Oracle. But I find it really difficult to understand what a DATABASE is in Oracle, given that it uses several similar concepts which I am struggling to differentiate. In MySQL there is a simple concept of a "database", instead of a mixture of
SCHEMA concept (a user's workspace, logically divided by TABLESPACES)
TNS and SID/SERVICES concept
CONNECTION concept (in both ODBC definitions and SQL Developer)
I won't ask for a pure definition of them as I am still reading, but just for some guidance on how I can map a MySQL database onto the closest Oracle concept.
This is the information I can give you coming from the perspective of a developer. I don't know huge amounts about Oracle, but I have done some fairly significant work with deploying to it for some applications that are now in production.
Database
A database, in Oracle terms, is a group of files that live on disk and are managed as a cohesive unit. The database contains almost everything: logins, roles, tables, indexes, temporary space, transaction logs, and so on. Creating one is a nontrivial task in Oracle. It basically requires direct access (as in SSH or Windows Remote Desktop) to the machine. It's common for a DBA to create one during installation and for that to be the only one the server ever hosts. Unlike in MySQL, PostgreSQL, and SQL Server, you can't really use this level for basic grouping. E.g., giving each developer their own database is uncommon because of the overhead in recreating it.
Schema
Oracle schemas conflate two purposes: users and namespaces.
Each schema is a user, and it can be associated with credentials (a password, a user in Active Directory). Note that all accounts are database specific; there is no way to create a user that can log into multiple databases (aside from pointing both databases at the same LDAP server or otherwise involving some external service).
The schema also acts as a namespace that contains objects (e.g., tables, views, procedures, and indexes), and the schema name can be used explicitly to qualify exactly which object you're trying to refer to. For example, if I say MYOWNER.MYTABLE, Oracle will look for the MYTABLE table owned by MYOWNER. If you need multiple copies of all the same objects, this is the easiest level to group them in, which makes them the best level for having per developer copies of the database.
It is common to divide the two concepts manually: a schema can be locked out of logging in, and permissions can be granted to another user on its objects. This is something of a hassle, though, since there's no way to grant permissions across an entire schema; each object must be granted explicitly to some user or role. There's also no way to force users to create objects in a specific schema besides their own; permissions can only be granted to either create objects in the user's own schema or globally in any schema.
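To make that concrete, here is a small Oracle-flavored sketch (MYOWNER, MYTABLE, and APP_USER are just illustrative names): objects land in the connected user's schema, are shared one GRANT at a time, and are referenced with the owner prefix from other schemas.

-- Connected as MYOWNER: the table is created in the MYOWNER schema
CREATE TABLE mytable (
    id   NUMBER PRIMARY KEY,
    name VARCHAR2(100)
);

-- Permissions must be granted object by object; there is no schema-wide grant
GRANT SELECT ON mytable TO app_user;

-- Connected as APP_USER: qualify the owner explicitly
SELECT * FROM myowner.mytable;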
Complete aside: in PostgreSQL and SQL Server, schemas are only namespaces, not users.
Tablespace
Tablespaces are sets of files on disk that contain everything you need to store, including both the data and the metadata (such as table definitions). A single database can use multiple tablespaces, and different objects within a schema can even be on different tablespaces. A tablespace can be one or many files, but they're managed as one logical unit. Each schema has a default tablespace for its objects if a tablespace isn't specified when creating the object. Sharing them between databases is somewhere between impossible and unheard of.
In practice, it's common to not even bother with tablespaces and just leave the default configuration alone. The default is one tablespace named USERS with one file, and it's the default tablespace for all schemas in the database. If you change these at all, you usually set a default for each schema and then never think about it again until disk space becomes an issue.
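If you do configure them, it usually amounts to nothing more than a default per schema, roughly like this (a sketch; the names and sizes are invented):

-- A dedicated tablespace and a schema that defaults to it
CREATE TABLESPACE app_data DATAFILE 'app_data01.dbf' SIZE 500M;

CREATE USER app_owner IDENTIFIED BY some_password
    DEFAULT TABLESPACE app_data
    QUOTA UNLIMITED ON app_data;

-- Tables APP_OWNER creates now go to APP_DATA unless a tablespace is named explicitly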
Instance
You didn't ask about these specifically, but you'll need to understand them before we can talk about connecting to the database.
An instance is the actual process running on the server that listens for connections. Like databases, these require direct access to the database server to set up. You can have multiple or a single one on the server. It's common to have one per database.
An instance can be identified two ways: an SID or a service name. The SID identifies a single instance, while the service name is an alias that can refer to several instances. The details of how that works are usually unimportant; just know that you need to know of them to connect.
Connecting
To connect from a client, you need a connect descriptor. This is a jumbled string containing the host, port, and either SID or service name. They look like this, for example: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orclservice))). They can get more complicated, but that's the basic form. To use an SID instead of a service name, you would replace SERVICE_NAME=orclservice with SID=orclinstance. There's also a newer, more compact format called "EZ connect" that looks like this instead: myoracleserver:1521/orclservice; it only supports the basic parameters.
TNS is short for "Transparent Network Substrate," and it consists of the entire networking stack used to communicate with the database. You virtually never need to concern yourself with it as a whole.
What you encounter often is TNS names. TNS names are aliases for connect descriptors. They're stored in a plain text file on the client machine, and they're typically global to the entire machine. Here's an example mapping that you might find in the file: mydatabase=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl))). In my experience, most of the time you can actually avoid bothering with TNS names entirely and just use the connect descriptor directly.
A connect identifier is anything that can stand in for a connect descriptor. It can be a full connect descriptor, an EZ connect descriptor, a TNS name, or several other things. But generally, they identify a server and the particular database on it that you want to connect to.
With all that in mind, connections become a little more straightforward. Conceptually, they're pretty much the same as in other databases. The thing that might be confusing about them is that you connect as a schema, as described earlier. The "username" is the schema name, and the schema can have a password or some other form of authentication associated with it. The connection string differs according to the client software, much like with any other database. For SQL*Plus (Oracle's command-line client), connection strings look like this: [USERNAME]/[PASSWORD]@[connect identifier]. So if your user is MY_SCHEMA, the password is PASS, and the server is as above, it might look like
MY_SCHEMA/PASS@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orclinstance)))
For a .NET application, it might look like
Data Source=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=myoracleserver)(PORT=1521))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl)));User Id=MY_SCHEMA;Password=PASS
which is pretty similar to any other database. Note that anywhere you see that nasty server information, you could replace it with any connect identifier (such as a TNS name).
As far as SQL Developer is concerned, a "connection" is really just a saved connection string. ODBC connects like any other database; you just need the right connection string and drivers.
Drivers
The drivers can be a pain point with Oracle, depending on the language. I believe Java has some decent standalone clients, but other languages generally depend on the binary client. The binary client does have an installer that puts the binaries on PATH, but the installer is pretty difficult to use and best avoided. When I can, I avoid installing the full client and make use of what's called the "Instant Client". Usually, if you can get the Instant Client binaries in a place where the app can find them, they just work. If not, it's preferable to prepend to PATH in memory for your application rather than modify it globally for your machine.
If you happen to be developing in .NET, use the managed ODP.NET provider on NuGet from Oracle. It's written entirely in managed .NET, eliminating the need to deal with native binaries.
Summary
So in short:
A database is part of the server set up
A schema is both a user and how you divide your database
A tablespace is the physical files that hold the database
TNS names are just a naming convenience on the client side
SID/Service Name are just names used when connecting
I find this arrangement far too complex, personally.

Difference between filter and a where clause

I'm always with my Access app..
As far as I know, when I execute a sql clause to my back end (accdb file), say
SELECT * FROM tbl WHERE id=1;
it gets filtered on the back end, and just one record is transmitted over the network.
My question is: when I open a form bound to a query (with no WHERE clause) using a filter parameter, like
DoCmd.OpenForm "Form",,, strFilter
how many records are transmitted over the network? Do they get filtered like that SQL statement, or do they get filtered locally, meaning a big pile of data has to be sent over the network?
I'm concerned about this because I have many subforms bound to queries, and I open them in the main forms with a filter parameter. And of course, the network here is not very good.
EDIT: My app runs at a factory with no local server. All the network infrastructure is at the company's headquarters 300 km away, so it's effectively a WAN.
Apart from upgrading to SQL Server or the like, do I have other options to make it more reliable? I've heard of something called 'Citrix'; I happen to have a 'Citrix Neighborhood Agent Program' in my system tray. Can it host my app to make it faster?
DoCmd.OpenForm "Form",,, strFilter
how many records are transmitted on the network?
As many as match your strFilter condition. So, if WHERE id=1 returns one row in the earlier SELECT query, and strFilter = "id=1", that OpenForm will open the form with that single row as its record source.
The WhereCondition parameter is also available for DoCmd.OpenReport, and operates the same way as with OpenForm, which you also may find useful.
Edit: You should have an index to support the WHERE criteria whether you build it into the query or do it "ad hoc" with OpenForm WhereCondition. With an index the database engine will read the index to find which rows match, then retrieve those rows. So retrieval will be more efficient, and therefore faster, than forcing the engine to read every row to determine which of them include matches.
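A minimal sketch (using the tbl/id names from the question; if id is already the primary key it will normally have an index automatically):

-- Index the column used in the WHERE clause / WhereCondition
CREATE INDEX ix_tbl_id ON tbl (id);

-- Both of these can now be satisfied by an index seek instead of a full table scan:
-- the explicit query ...
SELECT * FROM tbl WHERE id = 1;
-- ... and the equivalent OpenForm filter, DoCmd.OpenForm "Form", , , "id=1"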
When Jet/ACE requests data from a file server, the first thing it needs is the database header information, which contains data structures describing the structure of the data file. This information is requested once per Access session, so it's really only an issue at startup.
When you then request a record, Jet/ACE uses the metadata it has about the file to request the relevant index pages for the table(s) involved, then uses those indexes to determine the minimum number of data pages to request.
With properly structured indexing and filters on primary keys the amount of data retrieved is actually quite minimal.
However, it's still going to be more than will allow proper response times across a WAN. Access was designed for use across a wired LAN, back in the days when the networking standard was 10BaseT (10 Mbps). Anything less than that and you'll have problems. WiFi is right out as well, not because of bandwidth but because of the unreliability of the connections.
When you need to support users remotely, the easiest solution is to host the Access application on a Windows Terminal Server. WTS is built on technology licensed from Citrix, so you'll often see the whole concept described as Citrix, but your default WTS setup is quite different from a Citrix installation. You have to pay extra for Citrix, and it gives you a lot of different features.
I've used WTS without Citrix in many environments and frankly can't see what the justification would be for Citrix (except when you have to support large numbers of remote users, i.e., in the range of 100 or more). WTS is installed on every Windows Server starting with Windows 2000 and is very easy to set up and configure.
The second easiest solution, in my opinion, is to upsize the back end to a server database and then rewrite for efficiency to ensure you're using the server as much as possible and not pulling too much data across the wire.
A third solution would be Sharepoint, but I'm not experienced with that. It is definitely the direction that MS is pushing for Access apps in distributed setups, but it's quite complex and has a whole lot of features. I wouldn't recommend plunging into it without lots of preparation and significant corporate support.
Actually, with Access, there is not really a true back-end as there is with a bona-fide client-server engine like SQL Server or Oracle or Postgres. Access uses a shared-file architecture where the client program itself "owns" chunks of the file on disk, as distinct from a message-passing architecture where the client program sends requests for data to a back-end engine process running on a server where that process "owns" the data. With shared-file, all work occurs on the client, so it is possible for freight-train-loads of data to be brought across the wire if the database file resides on a different machine.
When you ask Access for data, it often reads a lot more data from the MDB file on disk and caches at the local client a lot more data than what your statement has asked for. Access tries to do this intelligently, anticipating your needs. "Now that I'm here", Access says, "I might as well make the expensive trip to disk worthwhile and grab a sh*tload of data". Don't get me wrong. I'm not an Access basher and have been using it for more than 10 years, from back in the days when LAN bandwidth was 10mbit/sec. Access is very good for some things. But Access can gobble up bandwidth like you wouldn't believe.
Read up on "keysets" in Access.
P.S. I am not the same Tim as the Tim who left you a comment.
Some useful links:
http://msdn.microsoft.com/en-us/library/dd942824(v=office.12).aspx
http://support.microsoft.com/kb/209126
http://support.microsoft.com/kb/112112
http://support.microsoft.com/kb/128808

Securing tables vs databases on a multi-tool web site with confidential information

I am working on a site that multiple projects will use to enter confidential subject information for various research projects. Project data access will be limited to specific users and tools, but certain core data will be referenced in and joined to the project tables (username, project meta-data, etc.). The current plan is that each project will have MySQL users with any combination of Select, Update, or Insert rights as needed, plus an overall project Administrator user that can alter the shape of the project's tables and will only be used in phpMyAdmin. We are using a Database object with some backtrace logic to determine what object passed it connection credentials, and it will only allow that connection to be used by the originating object (not impossible to get around for a dedicated programmer, but it would throw up red flags in code review). And we are following the standard procedure of moving the config out of the web root and keeping all credentials in config files instead of code. Of course there is an overall administrator, but that account has so many access rules and its password is ludicrously long (we have a static YubiKey + 10-char password).
What I want to know is whether to separate project data out into their own databases, or to put them in tables with access limited to certain accounts. Setting user permissions at the database or table level seems about equal in difficulty. There will be joins and other such operations between the core tables (usually meta-data) and the protected data. Joining across databases on the same server works fine, but I am uncertain how the performance of intra-database joins compares to inter-database joins.
It doesn't matter if you put them in the same database or in different ones. You can implement a good (or a bad) security concept with both alternatives.
If you are using one database and you put data for different users in one table, you will have to implement a lot of the access control in your application.
If you have separated the data completely into different tables (or even databases), you can easily use MySQL's access control. In that case I would go with separate databases, because it is more convenient when setting up a backup system or if you want to scale your application across more than one machine. But since you want to join across different databases, you're going to lose some of these advantages, so it doesn't really matter much.
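For comparison, the two MySQL permission levels look roughly like this (a sketch; the account, database, and table names are invented):

CREATE USER 'project_a_app'@'localhost' IDENTIFIED BY 'changeme';

-- Database-level: the project account can read/write anything in its own database
GRANT SELECT, INSERT, UPDATE ON project_a.* TO 'project_a_app'@'localhost';

-- Table-level: the same idea inside one shared database, granted table by table
GRANT SELECT, INSERT, UPDATE ON research.project_a_subjects TO 'project_a_app'@'localhost';

-- Cross-database joins on the same server still work once both sides are readable
GRANT SELECT ON core.users TO 'project_a_app'@'localhost';
SELECT s.*, u.username
FROM project_a.subjects AS s
JOIN core.users AS u ON u.id = s.entered_by;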

database synchronization - MS Access

I have an issue at the moment where multiple Access 2003 databases (all with the same schema) are used on laptops.
I need to find an automated way to synchronize the data into a central access database.
Data on the laptops is only ever appended, so update/delete operations won't be an issue.
Which tools will allow me to do this easily?
What factors will affect the decision on the best tool or solution?
It is possible to use the Jet replication built into Access, but I will warn you, it is quite flaky. It will also mess up the PKs on whatever tables you use it on, because it picks random signed integers to try to avoid key collisions, so you might end up with -1243482392912 as the next PK on a given record. That's a PITA to type in if you're doing any kind of lookup on it (like a customer ID, order number, etc.). You can't automate Access synchronization (maybe you can fake something like it using VBA, but even then it will only run when the database is opened).
The way I would recommend is to use SQL Server 2005/2008 on your "central" database and use SQL Server Express Editions as the back-end on your "remote" databases, then use linked tables in Access to connect to these SSEE databases and replication to sync them. Set up either merge replication or snapshot replication with your "central" database as the publisher and your SSEE databases as subscribers. Unlike Access Jet replication, you can control the PK numbering but for you, this won't be an issue as your subscribers will not be pushing changes.
Besides the scalability that SQL server would bring, you can also automate this using the Windows Synchronization manager (if you have synchronized folders, that's the annoying little box that pops up and syncs them when you logon/logoff), and set it up so that it synchronizes at a given interval, on startup, shutdown, or at a time of day, and/or when computer is idle, or only synchronizes on demand. Even if Access isn't run for a month, its data set can be updated every time your users connect to the network. Very cool stuff.
Access Replication can be awkward, and as you only require append queries with some checking, it would probably be best to write something yourself. If the data collected by each laptop cannot overlap, this may not be too difficult.
You will need to consider the primary keys. It may be best to incorporate the user or laptop name in the key to ensure that records relate correctly.
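A small sketch of that idea (generic SQL; the table and column names are invented): make the laptop identifier part of the key so each laptop's local counter can never collide in the central database.

CREATE TABLE central_readings (
    laptop_id   VARCHAR(20)   NOT NULL,  -- which laptop the row came from
    local_id    INT           NOT NULL,  -- that laptop's own record number
    reading_val DECIMAL(10,2),
    PRIMARY KEY (laptop_id, local_id)
);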
The answers in this thread are filled with misinformation about Jet Replication from people who obviously haven't used it and are just repeating things they've heard, or are attributing problems to Jet Replication that actually reflect application design errors.
It is possible to use the Jet replication built into Access, but I will warn you, it is quite flaky.
Jet Replication is not flaky. It is perfectly reliable when used properly, just like any other complex tool. It is true that certain things that cause no problems in a non-replicated database can lead to issues when replicated, but that stands to reason because of the nature of what replication by any database engine entails.
It will also mess up your PK on whatever tables you do it on because it picks random signed integers to try and avoid key collisions, so you might end up with -1243482392912 as your next PK on a given record. That's a PITA to type in if you're doing any kind of lookup on it (like a customer ID, order number, etc.)
Surrogate Autonumber PKs should never be exposed to users in the first place. They are meaningless numbers used for joining records behind the scenes, and if you're exposing them to users IT'S AN ERROR IN YOUR APPLICATION DESIGN.
If you do need sequence numbers, you'll have to roll your own and deal with the issue of how to prevent collisions between your replicas. But that's an issue for replication in any database engine. SQL Server offers the capability of allocating blocks of sequence numbers for individual replicas at the database engine level and that's a really nice feature, but it comes at the cost of increased administrative overhead from maintaining multiple SQL Server instances (with all the security and performance issues that entails). In Jet Replication, you'd have to do this in code, but that's hardly a complicated issue.
Another alternative would be to use a compound PK, where one column indicates the source replica.
But this is not some flaw in the Replication implementation of Jet -- it's an issue for any replication scenario with a need for meaningful sequence numbers.
You can't automate Access synchronization (maybe you can fake something like it by using VBA. but still, that will only be run when the database is opened).
This is patently untrue. If you install the Jet synchronizer you can schedule synchs (direct, indirect, or Internet synchs). Even without it, you could schedule a VBScript to run periodically and do the synchronization. Those are just two methods of accomplishing automated Jet synchronization without needing to open your Access application.
A quote from MS documentation:
Use Jet and Replication Objects
JRO is really not the best way to manage Jet Replication. For one, it has only one function that DAO itself lacks, i.e., the ability to initiate an indirect synch in code. But if you're going to add a dependency to your app (JRO requires a reference, or can be used via late binding), you might as well add a dependency on a truly useful library for controlling Jet Replication, and that's the TSI Synchronizer, created by Michael Kaplan, once the world's foremost expert on Jet Replication (who has since moved on to internationalization as his area of concentration). It gives you full programmatic control of almost all the replication functionality that Jet exposes, including scheduling synchs, initiating all kinds of synchronization, and the much-needed MoveReplica command (the only legal way to move or rename a replica without breaking replication).
JRO is one of the ugly stepchildren of Microsoft's aborted ADO-Everywhere campaign. Its purpose is to provide Jet-specific functionality to supplement what is supported in ADO itself. If you're not using ADO (and you shouldn't be in an Access app with a Jet back end), then you don't really want to use JRO. As I said above, it adds only one function that isn't already available in DAO (i.e., initiating an indirect synch). I can't help but think that Microsoft was being spiteful by creating a standalone library for Jet-specific functionality and then purposefully leaving out all the incredibly useful functions that they could have supported had they chosen to.
Now that I've disposed of the erroneous assertions in the answers offered above, here's my recommendation:
Because you have an append-only infrastructure, do what @Remou has recommended and set up something to manually send the new records wherever they need to go. And he's right that you still have to deal with the PK issue, just as you would if you used Jet Replication. This is necessitated by the requirement to add new records in multiple locations, and is common to all replication/synchronization applications.
But one caveat: if the add-only scenario changes in the future, you'll be hosed and have to start from scratch or write a whole lot of hairy code to manage deletes and updates (this is not easy -- trust me, I've done it!). One advantage of just using Jet Replication (even though it's most valuable for two-way synchronizations, i.e., edits in multiple locations) is that it will handle the add-only scenario without any problems, and then easily handle full merge replication should it become a requirement in the future.
Last of all, a good place to start with Jet Replication is the Jet Replication Wiki. The Resources, Best Practices and Things Not to Believe pages are probably the best places to start.
You should read into Access Database Replication, as there is some information out there.
But I think that in order for it to work correctly with your application, you will have to roll out a custom made solution using the methods and properties available for that end.
Use Jet and Replication Objects (JRO) if you require programmatic control over the exchange of data and design information among members of the replica set in Microsoft Access databases (.mdb files only). For example, you can use JRO to write a procedure that automatically synchronizes a user's replica with the rest of the set when the user opens the database. To replicate a database programmatically, the database must be closed.
If your database was created with Microsoft Access 97 or earlier, you must use Data Access Objects (DAO) to programmatically replicate and synchronize it.
You can create and maintain a replicated database in previous versions of Microsoft Access by using DAO methods and properties. Use DAO if you require programmatic control over the exchange of data and design information among members of the replica set. For example, you can use DAO to write a procedure that automatically synchronizes a user's replica with the rest of the set when the user opens the database.
You can use the following methods and properties to create and maintain a replicated database:
MakeReplica method
Synchronize method
ConflictTable property
DesignMasterID property
KeepLocal property
Replicable property
ReplicaID property
ReplicationConflictFunction property
Microsoft Jet provides these additional methods and properties for creating and maintaining partial replicas (replicas that contain a subset of the records in a full replica):
ReplicaFilter property
PartialReplica property
PopulatePartial method
You should definitely read the Synchronizing Data part of the documentation.
I used replication in Access 2000 for years, until forced to upgrade to Access 2007 (when it went away). The most problematic issue we ran into, at the enterprise level, was managing the CONFLICTS. If they are not managed in a timely fashion, or there are too many, users get frustrated and the data becomes unreliable.
Replication did work well when our remote sites were not always connected to the internet. This allowed them to work with their data, and synchronize when they could. At least twice daily.
We installed a separate database on the remote computers that managed the synchronization, so the user only had to click an icon on their desktop to invoke the synchronization.
The user had a separate button to push/pull feeds via a designated FTP file that would update from the legacy systems.
This process worked quite well, as we had 30 of these "nodes" working around the country, managing their data and updating to the FTP servers.
If you are seriously considering this path, let me know and I can send you my documentation.
You can write your own synchronization software that connects to the laptop, selects the diff from its db, and inserts it into the master.
How easy this operation will be depends on your data schema.
(If you have many tables with FKs... you will need to do it smartly.)
I think it will be most efficient if you write it yourself.
Automating this kind of behavior is called replication, and Access apparently supports it, but I've never seen it implemented.
Since I guess the laptop is not connected to the main DB most of the time, replicating the data is not a good idea anyway.
If you look for a 3rd-party tool to do it, look for something that can easily diff the tables before copying, and that can do it incrementally, of course.
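Because the data is append-only, the "diff" can be as simple as an anti-join against what the central copy already holds. A sketch, assuming the laptop's table is reachable as laptop_readings and uses the laptop_id/local_id key idea from above:

-- Append only the rows the central database doesn't have yet
INSERT INTO central_readings (laptop_id, local_id, reading_val)
SELECT l.laptop_id, l.local_id, l.reading_val
FROM laptop_readings AS l
LEFT JOIN central_readings AS c
       ON c.laptop_id = l.laptop_id
      AND c.local_id  = l.local_id
WHERE c.laptop_id IS NULL;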
FWIW:
Autonumbers. I agree with David - they should never be exposed. To remove that temptation, I use a Random autonumber.
Replication. I used this extensively some years back, with scheduled syncs and GUIDs as the PK. I repeatedly found that any hiccup over the network corrupted the replicas, with the result that I had to salvage data and re-issue replicas. Painful!