I have read a few/lots of things on this but they don't seem to help much.
I have an app (it's called "TieUp", but that is irrelevant) that I run manually every day to collate data from several locations.
It uses the following sources:
A) Data from a remote SOAP source, loaded into an in-memory TClientDataset via an XMLtransform setup.
B) CSV files downloaded daily and loaded into an in-memory TClientDataset
C) A MySQL database on the same computer as the program (it's a restored backup of the live source)
D) A remote MS SQL (SQL Server 2008) database
E) A MySQL database on a remote server
Data is only read from sources A, B, C and D.
Data source E is updated with the consolidated data.
There are between 800 and 2000 records daily, so the datasets are not vast, although the target (E) has grown to around 150,000 records and is increasing daily.
I can normally run this happily and everything works as expected (if a little slowly, because of all the individual remote lookups to the MS SQL system), but some days it really screws up and the error is always "Catastrophic Failure!".
The failure does not occur during any particular phase or operation that I can see. The steps are:
1) Get the SOAP(A) data first.
2) Tie in with CSV/In Memory data(B).
3) Look up reference data on sources C and D to collate
4) Write the consolidated data to source E
After reading the data into the in-memory datasets, everything is in TClientDatasets accessed via DatasetProviders linked to TSQLQueries (they are all on the same servers currently, but I did it that way to keep some flexibility in case it goes true three-tier in future). All queries are contained within the SQLQuery components as they are actually quite simple - it's just a matter of tying things together.
I am using completely standard components from Delphi 2009 Enterprise. All updates and database update packs have been applied. Each data source has its own DataModule; these are auto-created at startup.
There is obviously quite a lot of data access going on here, but when it crashes (with catastrophic failure) it gets stuck, completely stuck. Windows can't end the task from the normal "TieUp has stopped working" dialog; I have to go to the process and kill it.
There is so much going on and as this only happens once a week or so I really don't know where to start looking.
The reason for asking the question is twofold: 1) I am trying to eliminate any manual stuff and fully automate it, but I can't rely on it if it bombs every week or so. 2) If it happens in the update phase to E, I have to manually delete the new records for the day and start again, as I do not have (or haven't written yet) a mechanism to restart from a random point, and I would still have to query the DB manually to establish that point for certain.
My next step is to install Delphi on another computer and always run it under the debugger until I can catch it, if it does not freeze first. But that introduces yet another different network connection (instead of the local host one).
So: "Is there a definite answer?" or what is the most likely offending component/connection? Where is the favoured place to start looking?
Thanks in advance...
Related
I am new to a role at a company where they are using MS Access with a MySQL db running on a server that's physically in our office, behind our private network. I have been hired to develop an entirely new application to bring the company up to modern standards. As we move features/modules to the new Angular/NodeJs App I am writing, users still need to use the UI provided by MS Access against the new production database, which will be on AWS Lightsail.
However, when I change the MS Access ODBC connections to point to the AWS Lightsail MySQL db, everything (reports especially) in the MS Access UI becomes slower than when it was pointed at the MySQL db here in the office, in-network.
I am going to the "Linked Table Manager" and changing the "Connection String".
Somewhere I read I should make sure SSLMODE is disabled to remove any performance issues.
DSN=AWS_Dev;DATABASE=ECSDataTables;PORT=3306;SERVER=IP_ADDRESS;SSLMODE=DISABLED;
I went through the normal "ODBC Data Source Administrator" in Windows and added the MySQL AWS host, user/pass as normal.
I have done extensive research and have found several sources, but none are really helping.
I have been asked not to spend too much time trying to fix/optimize anything in MS Access as my focus should be on the new application, but it's hard to believe that a simple switch of MySQL database can have such an impact. In the new Angular/NodeJs application, everything runs very fast, so I know it's not the AWS MySQL db or anything.
Am I missing something? Are there any configurations I should be doing in MS Access? I have not used VB in about a decade, so I am hoping something can be done without the need for too much technical background in this matter.
Thank You.
Well, the issue is that your local area network (LAN) is about 10 times or more faster than your internet connection.
Your low-cost office network is very likely to be a 1 gigabit network (100BaseT is rare).
However, your high-speed internet connection is likely, say, 10 Mbit. So you are going from 1000 to 10 - that is 100 times slower. So 3 seconds now becomes 300 seconds.
I mean, with such a slower connection speed, no surprise should exist here.
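To make the arithmetic above concrete, here is a rough Python sketch. The 375 MB payload is a made-up figure, chosen only so that the LAN case comes out at 3 seconds, and the calculation ignores latency and protocol overhead, which in practice tend to make the WAN case worse still.

```python
def transfer_seconds(payload_megabytes: float, link_megabits_per_s: float) -> float:
    """Idealised transfer time: payload size divided by link bandwidth."""
    payload_megabits = payload_megabytes * 8
    return payload_megabits / link_megabits_per_s


PAYLOAD_MB = 375  # hypothetical amount of data a report pulls over the wire

lan_seconds = transfer_seconds(PAYLOAD_MB, 1000)  # 1 gigabit office LAN
wan_seconds = transfer_seconds(PAYLOAD_MB, 10)    # 10 Mbit internet link

print(f"LAN: {lan_seconds:.0f} s, WAN: {wan_seconds:.0f} s, "
      f"ratio: {wan_seconds / lan_seconds:.0f}x")
```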
What you can do, for any report based on a complex client-side SQL join, is convert the query to a server-side view and link to that view. Now use that view as the base source for the report. And of course the existing VBA filters that you always use (right???) to launch a report will now pull only the data they need down the network pipe. Access reports (or forms) only pull down what you ask for - not the whole table. So any filter you have (use the where clause of the open report command) will be respected. So you either have to pull less data, or find something with a similar speed rating to your local area network (and such high-speed internet is rare).
The LAN vs WAN concept and speed issue is outlined in this article:
http://www.kallal.ca//Wan/Wans.html
The above article is very old and internet speeds are about 10x faster today, but so is the local area network, which has gone from 100BaseT to 1 gigabit.
So, things are slower because you are working with a VASTLY slower connection speed. Slower is slower!!!
Edit
As noted, Access will only pull what you ask for. Where the Access client does a poor job is SQL queries that involve multiple tables - often the client will mess up what it sends server side. The solution in this case is to adopt server-side views. This means you move the client-side query that drives the report to a view, and link to that view. You will not gain much performance for a single-table query, but for any report based on complex multi-table joins, using a view will force the SQL and the "join work" to occur on the server side, and this can result in huge performance gains.
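A minimal sketch of the view idea, in Python with an in-memory SQLite database standing in for the MySQL server (that substitution, and the `customers`, `orders` and `order_report` names, are assumptions for illustration only). The join is defined once on the server as a view, and the client sends nothing more than a simple filtered select against it, which is roughly what a linked view plus a report where clause would amount to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the remote MySQL server
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 99.5), (11, 1, 20.0), (12, 2, 5.0);

    -- The join lives on the server; the client only ever sees its result.
    CREATE VIEW order_report AS
        SELECT o.id AS order_id, c.id AS customer_id, c.name AS customer, o.total
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id;
""")

# The client-side "report" is now a single-source query with a filter.
rows = conn.execute(
    "SELECT order_id, customer, total FROM order_report "
    "WHERE customer_id = ? ORDER BY order_id",
    (1,),
).fetchall()
print(rows)
```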
Well, this is a case where limited knowledge just produces worse results than the expected ones.
Over the years top DBAs have just "hated" MS Access... they see only problems, issues, you name it... and the end sentence is "switch to a real database engine".
Well, this has created a faulty assumption that MS SQL, MySQL, Oracle, PostgreSQL and the rest of the database engines are somewhat of a "magic pill"... you just switch the BE to one of the above DBEs and all your problems will get resolved... just like that.
DBE --Database Engine (if you would like to call somewhat else feel free)
WRONG
MS Access follows a different philosophy from the other DBEs, and it does its job damn well given all its shortcomings and the major fact that it is a file-based DBE.
Switching to another DBE will give amazing performance IF and ONLY IF you respect the fact that you are not working with Access... just don't treat e.g. MySQL as your file repository, and DON'T just link the tables and expect everything to go well...
Want to keep blaming Access? Just switch over to another platform (.NET, PHP, JS, Java... take your pick) and write a small application that pulls ALL of your data in a single go like you do with Access. It will certainly crash or go "Not Responding"...
So stop blaming Access... start reading up on how to make the most of the two worlds and I am pretty sure that the results will amaze you... but again I must stress that this is not a "magic pill" solution... it involves a LOT of work: planning, data manipulation, normalization, code changes and, above all, a change of philosophy.
I would recommend starting the journey by picking up this book: https://www.amazon.com/Microsoft-Access-Developers-Guide-Server/dp/0672319446 (I don't want complaints that it's old and about MS SQL... just read first and complain later)
Also take a look at an old benchmark-like video I made some years ago: https://www.linkedin.com/posts/tsgiannis_a-small-demo-of-connecting-ms-access-fe-to-activity-6392696633531858944-dsuU
Last but not least... years ago I was running some tests to see what the "magic pill" would do to my company's applications... the simplest test of all:
A simple table with few fields but around 8 million records... just display it
Access BE (local) --> It would run in 1-2 seconds... that's fast
Access BE (network share) --> It would run in a few seconds... not so fast, but it was usable
MSSQL BE (linked table) --> sometimes it would get the results, sometimes it wouldn't... slow, really slow... like you make a coffee and go for a small walk.
MySQL BE (linked table) --> it never finished... timeout or "Not Responding"
PostgreSQL BE (linked table) --> it never finished... timeout or "Not Responding"
So stop blaming Access...start working and get amazed....
I am currently using VB6 to connect to a MS access DB using DAO and I’m experiencing a very noticeable speed reduction when a 2nd user connects to the Database.
Here are the steps to reproduce:
Open the Database from computer A by logging into the software
Add records to the database via the software (takes about .4 seconds)
A second user logs into the software (Computer B), ie: this opens the database, displays todays transactions, but the user does nothing else
On Computer A, repeat the operation of adding records, now the operation takes approximately 6 seconds
Further info…
the operation continues to take approx. 6 seconds, even after Computer B logs out of the software
if you close and reopen the application from Computer A the operation returns to taking only .4 seconds to execute!
Any help would be greatly appreciated!
Thanks!
That is the way MS Access works. While it kind of supports multiple users, and kind of supports placing the DB on a file share so multiple PCs can access it, it does neither really well. And if you are doing both (multi-user and over a network to a file share) then I feel for your pain.
The answer is to run the upgrade wizard and convert this to an MS SQL Server instance. MS SQL Server Express edition is a good choice to replace Access in this case. Note that you can still keep all of the code, reports etc. you have in Access; only the data needs to be moved.
Just to be clear on the differences: in MS Access, when you read data from the database, all of the data required to perform your query is read from a file by your program; no server-side processing is done. If that data resides on a network, you are pulling that data across your network. If there are multiple users, you have the additional overhead of locking. Each user's program/process effectively talks to the program/process of the other users via file I/O (writing lock info into the networked file or files). And if the network I/O times out or has other issues, those files can become corrupted.
In SQL Server, it is the SQL Server engine that manages the data requests and only returns the data required. It also manages the locks and can detect when a client has disconnected or timed out to clean up, which reduces issues caused by multiple users on a network.
We had this problem with our VB3 / Jet DB 2.5 application when we transitioned to using newer file servers.
The problem is "opportunistic locking" : http://support.microsoft.com/kb/296264?wa=wsignin1.0
Albert is probably describing the same thing; the server will permit one client exclusive access to a file, but when another chimes in, this exclusive access will "thrash" between them, causing delays as the client with the oplock flushes all its local cache to the server before the other client can access the file.
This may also be why you're getting good performance with one client - if it takes an oplock, it can cache all the data locally.
This can also cause some nasty corruption if one of your clients has a power failure or drops off the network, because this flushing of the local cache to the server can be interrupted.
You used to be able to disable this (on the client - so you need to service ALL the clients) on Windows 2000 and XP as per the article, but after Vista SP2 it seems to be impossible.
The comments about not using Access / JetDB as a multi-user database are essentially correct - it's not a good architectural choice, especially in light of the above. DAO is also an obsolete library, even in the obsolete VB6. ADODB is a better choice for VB6, and should allow you some measure of database independence depending on how your app is written.
Since, as you pointed out, you get decent performance with one user on the system, your application by nature is obviously not pulling too much data over the network, and we can't blame network speed here.
In fact, what is occurring is that the Windows file share system is switching from single-user file share mode into multi-user file share mode. This switching of file modes causes a significant delay. And it also means that the second and subsequent users have to attempt to figure out and set up locks on the file.
To remove this noticeable delay, simply open what we call a persistent connection at the start of your application. A persistent connection is simply something that forces the network connection to remain open at all times, and therefore the significant delay in switching between the two file share modes is eliminated. You should now find that performance with two users is the same as with one (assuming one user is idle and not increasing network load). So at application startup time, open a back-end table into a global variable and KEEP that table open at all times.
I've been asked for a quick turnaround on this. The group I'm assisting has an .MDB database used by offsite workers who don't have internet access all the time. Thus, way back, the team implemented an Access DB which allows for synchronization.
As their team grew bigger they started running into the following issues:
Remote synching – when a user tries to synch from a worksite, more often than not the database will crash, due to loss of wireless signal, the program timing out, or the Inspector manually shutting it down because of the time it takes (i.e., 30 or more minutes)
Multiple synchers – we are unable to synch multiple users at one time (there are currently 34 users in 3 different territories). If someone is synching and another person tries to synch at the same time, the second user will end up with an error message. They will have to shut down their DB and try to synch at a later time.
Incomplete synchs – sometimes when a worker synchs his/her DB, not all the line items will copy over to the Master file, which can cause confusion during review.
Are there any workarounds or items I can look into to resolve these?
I have few resources and little time, so anything involving a new server might not work.
Thanks
It sounds as though you are mainly adding new data from different field operatives, rather than everyone updating existing data. If this is the case then that's good, and you could try the following:
Ensure all the tables have "Replication IDs" for the Primary Keys, as this will ensure no two operatives create conflicting records.
The synchronisation process should then be amended to take a snapshot of said table/tables to a .txt file on the operative's machine, and this file is then transferred back to the source machine.
Then at the end of the day, or more often if required, the master copy should be set up to import the new data from all the text files it has received. As there will be no conflicting Primary Keys you should be OK; just remember to insert only those records where the Primary Key is not already in the table.
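The steps above could look roughly like this. It is only a sketch in Python with SQLite standing in for the Access master file; the `inspections` table, its columns and the helper names are all made up for illustration, and `uuid4` plays the role of the Replication ID.

```python
import csv
import io
import sqlite3
import uuid


def new_record_id() -> str:
    """Globally unique key, playing the role of an Access Replication ID."""
    return str(uuid.uuid4())


def export_snapshot(rows) -> str:
    """Write an operative's rows as CSV text (the .txt file that gets sent back)."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def import_snapshot(master: sqlite3.Connection, snapshot_text: str) -> int:
    """Insert only rows whose primary key is not already in the master table."""
    (before,) = master.execute("SELECT COUNT(*) FROM inspections").fetchone()
    for record_id, note in csv.reader(io.StringIO(snapshot_text)):
        master.execute(
            "INSERT INTO inspections (id, note) "
            "SELECT ?, ? WHERE NOT EXISTS (SELECT 1 FROM inspections WHERE id = ?)",
            (record_id, note, record_id),
        )
    master.commit()
    (after,) = master.execute("SELECT COUNT(*) FROM inspections").fetchone()
    return after - before


master = sqlite3.connect(":memory:")  # stand-in for the master database
master.execute("CREATE TABLE inspections (id TEXT PRIMARY KEY, note TEXT)")

alice_rows = [(new_record_id(), "site A ok")]
bob_rows = [(new_record_id(), "site B leak"), (new_record_id(), "site C ok")]

first = import_snapshot(master, export_snapshot(alice_rows))
second = import_snapshot(master, export_snapshot(bob_rows))
repeat = import_snapshot(master, export_snapshot(bob_rows))  # a re-sent file adds nothing
print(first, second, repeat)
```

Because the import skips keys it has already seen, a file that arrives twice (or a sync that is retried after a dropped connection) does no harm.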
Hope all that makes sense : )
I'm always with my Access app..
As far as I know, when I execute a sql clause to my back end (accdb file), say
SELECT * FROM tbl WHERE id=1;
It gets filtered on the back end, then just one record is transmitted over the network.
My question is, when I open a form bound to a query (no where clause) using a filter parameter, like
DoCmd.OpenForm "Form",,, strFilter
how many records are transmitted on the network? Do they get filtered like that SQL clause, or do they get filtered locally, meaning a big pile of data has to be sent over the network?
I'm concerned about this because I have many subforms bound to queries, and I open them in the main forms with a filter parameter. And of course, the network here is not very good.
EDIT: My app runs in a factory with no local server. All the network/IT infrastructure is at the company's headquarters 300km away, maybe over a WAN.
Apart from upgrading to SQL Server or the like, do I have other options to make it more reliable? I've heard of something called 'Citrix', and I happen to have a 'Citrix Neighborhood Agent Program' in my sys tray; can it host my app to make it faster?
DoCmd.OpenForm "Form",,, strFilter
how many records are transmitted on the network?
As many as match your strFilter condition. So, if WHERE id=1 returns one row in the earlier SELECT query, and strFilter = "id=1", that OpenForm will open the form with that single row as its record source.
The WhereCondition parameter is also available for DoCmd.OpenReport, and operates the same way as with OpenForm, which you also may find useful.
Edit: You should have an index to support the WHERE criteria whether you build it into the query or do it "ad hoc" with OpenForm WhereCondition. With an index the database engine will read the index to find which rows match, then retrieve those rows. So retrieval will be more efficient, and therefore faster, than forcing the engine to read every row to determine which of them include matches.
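To see the effect of an index on a WHERE lookup, here is a small sketch in Python using SQLite's query planner. SQLite is not Jet/ACE, so this only illustrates the general principle; `tbl` and `id` come from the question, while the index name and the sample data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key on purpose, so the table starts out with no index at all.
conn.execute("CREATE TABLE tbl (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)", [(i, f"row {i}") for i in range(1000)]
)


def plan(sql: str) -> str:
    """Return SQLite's textual query plan for a statement."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM tbl WHERE id = 1"

plan_without_index = plan(query)  # the engine has to scan every row
conn.execute("CREATE INDEX idx_tbl_id ON tbl (id)")
plan_with_index = plan(query)     # the engine can search the index instead

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan varies between SQLite versions, but the first should report a scan of the table and the second a search using `idx_tbl_id`.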
When Jet/ACE requests data from a file server, the first thing it needs is the database header information, which has data structures describing the structure of the data file. This information is requested once in your Access session, so it's really only an issue at startup.
When you then request a record, Jet/ACE uses the metadata it has about the file to request the relevant index pages for the table(s) involved, then uses those indexes to determine the minimum number of data pages to request.
With properly structured indexing and filters on primary keys the amount of data retrieved is actually quite minimal.
However, it's still going to be more than will allow proper response times across a WAN. Access was designed for use across a wired LAN, back in the days when the networking standard was 10BaseT (10Mbps). Anything less than that and you'll have problems. WiFi is right out, as well, but not because of bandwidth, but because of the unreliability of the connections.
When you need to support users remotely, the easiest solution is to host the Access application on a Windows Terminal Server. WTS is built on technology licensed from Citrix, so you'll often see the whole concept described as Citrix, but your default WTS setup is quite different from a Citrix installation. You have to pay extra for Citrix, and it gives you a lot of different features.
I've used WTS without Citrix in many environments and frankly can't see what the justification would be for Citrix (except when you have to support large numbers of remote users, i.e., in the range of 100 or more). WTS is installed on every Windows Server starting with Windows 2000 and is very easy to set up and configure.
The second easiest solution, in my opinion, is to upsize the back end to a server database and then rewrite for efficiency to ensure you're using the server as much as possible and not pulling too much data across the wire.
A third solution would be Sharepoint, but I'm not experienced with that. It is definitely the direction that MS is pushing for Access apps in distributed setups, but it's quite complex and has a whole lot of features. I wouldn't recommend plunging into it without lots of preparation and significant corporate support.
Actually, with Access, there is not really a true back-end as there is with a bona-fide client-server engine like SQL Server or Oracle or Postgres. Access uses a shared-file architecture where the client program itself "owns" chunks of the file on disk, as distinct from a message-passing architecture where the client program sends requests for data to a back-end engine process running on a server where that process "owns" the data. With shared-file, all work occurs on the client, so it is possible for freight-train-loads of data to be brought across the wire if the database file resides on a different machine.
When you ask Access for data, it often reads a lot more data from the MDB file on disk and caches at the local client a lot more data than what your statement has asked for. Access tries to do this intelligently, anticipating your needs. "Now that I'm here", Access says, "I might as well make the expensive trip to disk worthwhile and grab a sh*tload of data". Don't get me wrong. I'm not an Access basher and have been using it for more than 10 years, from back in the days when LAN bandwidth was 10mbit/sec. Access is very good for some things. But Access can gobble up bandwidth like you wouldn't believe.
Read up on "keysets" in Access.
P.S. I am not the same Tim as the Tim who left you a comment.
Some useful links:
http://msdn.microsoft.com/en-us/library/dd942824(v=office.12).aspx
http://support.microsoft.com/kb/209126
http://support.microsoft.com/kb/112112
http://support.microsoft.com/kb/128808
We have an ASP.NET web application hosted by a web farm of many instances using SQL Server 2008 in which we do aggregation and pre-processing of data from multiple sources into a format optimised for fast end user query performance (producing 5-10 million rows in some tables). The aggregation and optimisation is done by a service on a back end server which we then want to distribute to multiple read only front end copies used by the web application instances to facilitate maximum scalability.
My question is about the best way to get this data from a back end database out to the read only front end copies in such a way that does not kill their performance during the process. The front end web application instances will be under constant high load and need to have good responsiveness at all times.
The backend database is constantly being updated so I suspect that transactional replication will not be the best approach, as the constant stream of updates to the copies will hurt their performance.
Staleness of data is not a huge issue so snapshot replication might be the way to go, but this will result in poor performance during the periods of replication.
Doing a drop and bulk insert will result in periods with no data for user queries.
I don't really want to get into writing a complex cluster approach where we drop copies out of the cluster during updating - is there something along these lines that we can do without too much effort, or is there a better alternative?
There is actually a technology built into SQL Server 2005 (and 2008) that is designed to address this kind of issue: Service Broker (which I'll refer to as SSB from here on). The problem is that it has a very steep learning curve.
I know MySpace went public about how it uses SSB to manage its park of SQL Servers: MySpace Uses SQL Server Service Broker to Protect Integrity of 1 Petabyte of Data. I know of several more (major) sites that use similar patterns, but unfortunately they have not gone public so I cannot name names. I was personally involved with some projects around this technology (I am a former member of the SQL Server team).
Now bear in mind that SSB is not a dedicated data transfer technology like Replication. As such you will not find anything similar to the publishing wizards and simple deployment options of Replication (check a table and it gets transferred). SSB is a reliable messaging technology and as such its primitives stop at the level of message exchange; you would have to write the code that leverages the data change capture, packs it into messages, and also unpacks the messages into relational tables at the destination.
The reason some companies still prefer SSB over Replication for a task like the one you describe is that SSB has a far better story when it comes to reliability and scalability. I know of projects that exchange data between 1500+ sites, far beyond the capabilities of Replication. SSB is also abstracted from the physical topology: you can move databases, rename machines and rebuild servers, all without changing the application. Because data flow occurs over logical routes, the application can adapt on the fly to new topologies. SSB is also resilient to long periods of disconnection and downtime, being capable of resuming the data flow after hours, days and even months of disconnection. The high throughput achieved by engine integration (SSB is part of the SQL engine itself, not a collection of satellite applications and processes like Replication) means that the backlog of changes can be processed in a reasonable time (I know of sites that are going through half a million transactions per minute). SSB applications typically rely on internal Activation to process the incoming data. SSB also has some unique features like built-in load balancing (via routes) with sticky session semantics, support for deadlock-free application-specific correlated processing, priority data delivery, specific support for database mirroring, certificate-based authentication for cross-domain operations, built-in persisted timers and many more.
This is not a specific answer to 'how to move data from table T on server A to server B'. It is more a generic technology for how to 'exchange data between server A and server B'.
I've never had to deal with this scenario before but did come up with a possible solution for this. Basically, it would require a change in your main database structure. Instead of storing the data, you would keep records of modifications of this data. Thus, if a record is added, you store "Table X, inserted new record with these values: ..." With modifications, just store the table, field and changed value. With deletions, just store which record is deleted. Every modification will be stored with a timestamp.
Your client systems would keep their local copies of the database and will regularly ask for all database modifications after a certain date/time. You then execute those modifications on the local database and it will be up-to-date again.
And the back-end? Well, it would just keep a list of modifications and perhaps a table with the base data. Keeping just the modifications also means you're keeping track of history, allowing you to ask the system what it looked like a year ago.
How well this would perform depends on the number of modifications on the back-end database. But if you request the changes every 15 minutes, it shouldn't be that much data every time.
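Here is a rough sketch of that modification-log idea, in Python with SQLite as the back end. Everything in it (the `change_log` table, its columns, the function names, the use of a plain dict as the client's local copy) is made up for illustration, and as a simplification an insert only creates an empty record whose field values arrive as separate update entries.

```python
import sqlite3

backend = sqlite3.connect(":memory:")
backend.execute("""
    CREATE TABLE change_log (
        seq        INTEGER PRIMARY KEY AUTOINCREMENT,
        changed_at TEXT NOT NULL,     -- ISO 8601 timestamp, sorts as text
        table_name TEXT NOT NULL,
        action     TEXT NOT NULL,     -- 'insert', 'update' or 'delete'
        record_id  INTEGER NOT NULL,
        field      TEXT,
        new_value  TEXT
    )""")


def log_change(changed_at, table_name, action, record_id, field=None, new_value=None):
    """Back end: record a modification instead of overwriting data in place."""
    backend.execute(
        "INSERT INTO change_log (changed_at, table_name, action, record_id, field, new_value) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (changed_at, table_name, action, record_id, field, new_value),
    )


def changes_since(timestamp):
    """Client request: all modifications made after the given date/time."""
    return backend.execute(
        "SELECT action, record_id, field, new_value FROM change_log "
        "WHERE changed_at > ? ORDER BY seq",
        (timestamp,),
    ).fetchall()


def apply_changes(local, changes):
    """Client: replay the modifications on the local copy (single table only)."""
    for action, record_id, field, new_value in changes:
        if action == "insert":
            local.setdefault(record_id, {})
        elif action == "update":
            local.setdefault(record_id, {})[field] = new_value
        elif action == "delete":
            local.pop(record_id, None)


local_copy = {}
log_change("2024-01-01T10:00:00", "products", "insert", 1)
log_change("2024-01-01T10:00:05", "products", "update", 1, "price", "9.99")
apply_changes(local_copy, changes_since("2024-01-01T00:00:00"))
last_sync = "2024-01-01T10:15:00"

log_change("2024-01-01T10:20:00", "products", "update", 1, "price", "12.50")
log_change("2024-01-01T10:25:00", "products", "insert", 2)
log_change("2024-01-01T10:28:00", "products", "delete", 2)
second_batch = changes_since(last_sync)
apply_changes(local_copy, second_batch)
print(local_copy)
```

A real version would probably poll by sequence number rather than by timestamp, since two changes can share a timestamp.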
But again, I never had the chance to work this out in a real application, so it's still a theoretical principle for me. It seems fast, but a lot of work will be required.
Option 1: Write an app to transfer the data using row level transactions. It might take longer but would result in no interruption of the site using the data because the rows are there before and after the read occurs, just with new data. This processing would happen on a separate server to minimize load.
In SQL Server 2008 you can set READ_COMMITTED_SNAPSHOT to ON to ensure that the row being updated does not cause blocking.
But basically all this app does is read the new data as it is available out from one database and into the other.
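A bare-bones sketch of what such a transfer app might do, written in Python with two in-memory SQLite databases standing in for the aggregation server and a front-end copy. The `summary` table and the `transfer_rows` helper are hypothetical, and `INSERT OR REPLACE` is SQLite's upsert, not SQL Server syntax.

```python
import sqlite3

source = sqlite3.connect(":memory:")  # stands in for the aggregation server
target = sqlite3.connect(":memory:")  # stands in for a read-only front-end copy
for db in (source, target):
    db.execute("CREATE TABLE summary (id INTEGER PRIMARY KEY, value TEXT)")

source.executemany("INSERT INTO summary VALUES (?, ?)", [(1, "new-1"), (2, "new-2")])
source.commit()
target.execute("INSERT INTO summary VALUES (1, 'old-1')")
target.commit()


def transfer_rows(src: sqlite3.Connection, dst: sqlite3.Connection) -> int:
    """Copy rows one at a time, each in its own short transaction."""
    copied = 0
    rows = src.execute("SELECT id, value FROM summary ORDER BY id").fetchall()
    for row_id, value in rows:
        with dst:  # commits on success, rolls back if the statement fails
            dst.execute(
                "INSERT OR REPLACE INTO summary (id, value) VALUES (?, ?)",
                (row_id, value),
            )
        copied += 1
    return copied


copied = transfer_rows(source, target)
front_end_rows = target.execute("SELECT id, value FROM summary ORDER BY id").fetchall()
print(copied, front_end_rows)
```

Keeping each transaction down to a single row means no lock is held for long, at the cost of a slower overall transfer.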
Option 2: Move the data (tables or entire database) from the aggregation server to the front-end server. Automate this if possible. Then switch your web application to point to the new database or tables for future requests. This works but requires control over the web app, which you may not have.
Option 3: If you were talking about a single table (though this could work with many), what you can do is a view swap. You write your code against a SQL view which points to Table A. You do your work on Table B and when it's ready, you update the view to point to Table B. You can even write a function that determines the active table and automate the whole swap thing.
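The view swap could be sketched like this in Python with SQLite. The `report`, `report_a` and `report_b` names are invented, and because SQLite has no ALTER VIEW the swap is done as a drop and re-create inside one explicit transaction; on SQL Server an ALTER VIEW would be the natural equivalent.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE report_a (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE report_b (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO report_a VALUES (1, 'yesterday');
    CREATE VIEW report AS SELECT id, value FROM report_a;
""")


def active_table() -> str:
    """Work out which table the view currently points at from its definition."""
    (definition,) = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'report'"
    ).fetchone()
    return "report_a" if "report_a" in definition else "report_b"


def swap_view() -> str:
    """Repoint the view at the other table; readers never see a missing view."""
    new_target = "report_b" if active_table() == "report_a" else "report_a"
    conn.execute("BEGIN")
    try:
        conn.execute("DROP VIEW report")
        conn.execute(f"CREATE VIEW report AS SELECT id, value FROM {new_target}")
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    return new_target


before = conn.execute("SELECT value FROM report").fetchall()

# Do the slow rebuild on the inactive table, then flip the view.
conn.execute("INSERT INTO report_b VALUES (1, 'today')")
now_active = swap_view()

after = conn.execute("SELECT value FROM report").fetchall()
print(before, now_active, after)
```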
Option 4: You might be able to use something like byte-level replication of the server, which is basically copying the server from point A to point B exactly, down to the very bytes. That sounds scary though. It's mostly used in DR situations, and this sounds like it could be a kinda/sorta DR situation, but not really.
Option 5: Give up and learn how to sell insurance. :)