MSAccess: SQL WITH (NOLOCK) vs Select Distinct? - ms-access

I has an Access app with multiple users. I have a task that seems to be causing an occasional collision:
Users are fetching data from a linked SQL Server table to their local DB, dirt simple:
SELECT dbo_Item.*
FROM dbo_Item
WHERE (((dbo_Item.INVID)=4892));
Some of these fetches can involve 350K+ records and take >~2 minutes to complete
I currently have db.QueryTimeout = 0
If two users are attempting to hit the remote table at the same time, we will occasionally see a SQL/Network error. They are NOT attempting to access the same dataset.
I doubt it's a timeout issue as the ERR is thrown within 1-2 minutes
The owner of the SQL Server-side data suggested adding a (NOLOCK) hint to my Access queries. I am not seeing that as an option, but I did find a random suggestion that performing a '"SELECT DISTINCT"' would not lock the table - is this true?
Any suggestions to avoid record locking issue - or ideas what else to look for?
Thanks!

Are they searching through 350k records or copying 350k records?
If searching through 350k records, an index on INVID may be useful.
If copying 350k record, DISTINCT could reduce this by removing duplicates, but if there are no duplicate entries, it doesn't help.
NOLOCK would help if other people were accessing the same data, essentially leaving the data open for others to modify, but your description is they are accessing different data. If from the same table, NOLOCK could help.

I decided to use a passthru SELECT query with (NOLOCK) applied. Then, I created a make_table query based on the passthru SELECT query. Seems to work just fine.

Related

MySQL JOIN Keeps Timing Out

I am currently trying to run a JOIN between two tables in a local MySQL database and it's not working. Below is the query, I am even limiting the query to 10 rows just to run a test. After running this query for 15-20 minutes, it tells me "Error Code" 2013. Lost connection to MySQL server during query". My computer is not going to sleep, and I'm not doing anything to interrupt the connection.
SELECT rd_allid.CreateDate, rd_allid.SrceId, adobe.Date, adobe.Id
FROM rd_allid JOIN adobe
ON rd_allid.SrceId = adobe.Id
LIMIT 10
The rd_allid table has 17 million rows of data and the adobe table has 10 million. I know this is a lot, but I have a strong computer. My processor is an i7 6700 3.4GHz and I have 32GB of ram. I'm also running this on a solid state drive.
Any ideas why I cannot run this query?
"Why I cannot run this query?"
There's not enough information to determine definitively what is happening. We can only make guesses and speculations. And offer some suggestions.
I suspect MySQL is attempting to materialize the entire resultset before the LIMIT 10 clause is applied. For this query, there's no optimization for the LIMIT clause.
And we might guess that there is not a suitable index for the JOIN operation, which is causing MySQL to perform a nested loops join.
We also suspect that MySQL is encountering some resource limitation which is causing the session to be terminated. Possibly filling up all space in /tmp (that usually throws an error, something like "invalid/corrupted myisam table '#tmpNNN'", something of that ilk. Or it could be some other resource constraint. Without doing an analysis, we're just guessing.
It's possible MySQL wrote something to the error log (hostname.err). I'd check there.
But whatever condition MySQL is running into (the answer to the question "Why I cannot run this query")
I'm seriously questioning the purpose of the query. Why is that query being run? Why is returning that particular resultset important?
There are several possible queries we could execute. Some of those will run a long time, and some will be much more performant.
One of the best ways to investigate query performance is to use MySQL EXPLAIN. That will show us the query execution plan, revealing the operations that MySQL will perform, and in what order, and indexes will be used.
We can make some suggestions as to some possible indexes to add, based on the query shown e.g. on adobe (id, date).
And we can make some suggestions about modifications to the query (e.g. adding a WHERE clause, using a LEFT JOIN, incorporate inline views, etc. But we don't have enough of a specification to recommend a suitable alternative.
You can try something like:
SELECT rd_allidT.CreateDate, rd_allidT.SrceId, adobe.Date, adobe.Id
FROM
(SELECT CreateDate, SrceId FROM rd_allid ORDER BY SrceId LIMIT 1000) rd_allidT
INNER JOIN
(SELECT Id FROM adobe ORDER BY Id LIMIT 1000) adobeT ON adobeT.id = rd_allidT.SrceId;
This may help you get a faster response times.
Also if you are not interested in all the relation you can also put some WHERE clauses that will be executed before the INNER JOIN making the query faster also.

Will a MySQL SELECT statement interrupt INSERT statement?

I have a mysql table that keep gaining new records every 5 seconds.
The questions are
can I run query on this set of data that may takes more than 5 seconds?
if SELECT statement takes more than 5s, will it affect the scheduled INSERT statement?
what happen when INSERT statement invoked while SELECT is still running, will SELECT get the newly inserted records?
I'll go over your questions and some of the comments you added later.
can I run query on this set of data that may takes more than 5 seconds?
Can you? Yes. Should you? It depends. In a MySQL configuration I set up, any query taking longer than 3 seconds was considered slow and logged accordingly. In addition, you need to keep in mind the frequency of the queries you intend to run.
For example, if you try to run a 10 second query every 3 seconds, you can probably see how things won't end well. If you run a 10 second query every few hours or so, then it becomes more tolerable for the system.
That being said, slow queries can often benefit from optimizations, such as not scanning the entire table (i.e. search using primary keys), and using the explain keyword to get the database's query planner to tell you how it intends to work on that internally (e.g. is it using PKs, FKs, indices, or is it scanning all table rows?, etc).
if SELECT statement takes more than 5s, will it affect the scheduled INSERT statement?
"Affect" in what way? If you mean "prevent insert from actually inserting until the select has completed", that depends on the storage engine. For example, MyISAM and InnoDB are different, and that includes locking policies. For example, MyISAM tends to lock entire tables while InnoDB tends to lock specific rows. InnoDB is also ACID-compliant, which means it can provide certain integrity guarantees. You should read the docs on this for more details.
what happen when INSERT statement invoked while SELECT is still running, will SELECT get the newly inserted records?
Part of "what happens" is determined by how the specific storage engine behaves. Regardless of what happens, the database is designed to answer application queries in a way that's consistent.
As an example, if the select statement were to lock an entire table, then the insert statement would have to wait until the select has completed and the lock has been released, meaning that the app would see the results prior to the insert's update.
I understand that locking database can prevent messing up the SELECT statement.
It can also put a potentially unacceptable performance bottleneck, especially if, as you say, the system is inserting lots of rows every 5 seconds, and depending on the frequency with which you're running your queries, and how efficiently they've been built, etc.
what is the good practice to do when I need the data for calculations while those data will be updated within short period?
My recommendation is to simply accept the fact that the calculations are based on a snapshot of the data at the specific point in time the calculation was requested and to let the database do its job of ensuring the consistency and integrity of said data. When the app requests data, it should trust that the database has done its best to provide the most up-to-date piece of consistent information (i.e. not providing a row where some columns have been updated, but others yet haven't).
With new rows coming in at the frequency you mentioned, reasonable users will understand that the results they're seeing are based on data available at the time of request.
All of your questions are related to locking of table.
Your all questions depend on the way database is configured.
Read : http://www.mysqltutorial.org/mysql-table-locking/
Perform Select Statement While insert statement working
If you want to perform a select statement during insert SQL is performing, you should check by open new connection and close connection every time. i.e If I want to insert lots of records, and want to know that last record has inserted by selecting query. I must have to open connection and close connection in for loop or while loop.
# send a request to store data
insert statement working // take a long time
# select statement in while loop.
while true:
cnx.open()
select statement
cnx.close
//break while loop if you get the result

Two MySQL requests at the same time - Performance issue

I have a MySQL server with many innodb tables.
I have a background script that does A LOT a delete/insert with one request : it deletes many millions of rows from table 2, then insert many millions of rows to table 2 using data from table 1 :
INSERT INTO table 2 (date)
SELECT date from table 1 GROUP BY date
(The request is actually more complex but it is to shown what kind of request I am doing).
At the same time, I am going to run a second background script, that does about a million INSERT or UPDATE requests, but separately (I mean, I execute the first update query, then I execute an insert query, etc...) in table 3.
My issue is that when a script is running, it is fast, like let's say it takes 30minutes each, so 1h total. But when the two scripts are running at the same time, it is VERY slow, like it will take 5h, instead of 1h.
So first, I would like to know what can cause this ? Is it because of IO performance ? (like mysql is writing in two different tables so it is slow to switch between the two ?)
And how could I fix this ? If I could say that the big INSERT query is paused while my second background script is running, it would be great, for example... But I can't find a way to do something like this.
I am not an expert at MySQL administration.. If you need more information, please let me know !
Thank you !!
30 minutes for million INSERT is not fast. Do you have an index on date column? (or whatever column you are using to pivot on)
Regarding your original question.It's difficult to say much without knowing the details of both your scripts and the table structures, but one possible reason why the scripts are running reasonably quickly separately is because you are doing similar kinds of SELECT queries, which might be getting cached by MySQL and then reused for subsequent queries. But if you are running two queries in parallel, then the SELECT's for the corresponding query might not stay in the cache (because there are two concurrent processes which send new queries all the time).
You might want to explicitly disable cache for some queries which you are sure you only run once (using SQL_NO_CACHE modifier) and see if it changes anything. But I'd look into indexing and into your table structure first, because 30 minutes seems to be extremely slow :) E.g. you might also want to introduce partitioning by date for your tables, if you know that you always choose entries in a given period (say by month). The exact tricks depend on your data.
UPDATE: Another issue might be that both your queries work with the same table (table 1), and the default transaction isolation level in MySQL is REPEATABLE READS afair. So it might be that one query is waiting until the other is done with the table to satisfy the transaction isolation level. You might want to lower the transaction isolation level if you are sure that your table 1 is not changed when scripts are working on it.
You can use an event scheduler so you can set mysql to launch this queries at different hours of the day, in another stackoverflow related question you have an exmaple of how to do it: MySQL Event Scheduler on a specific time everyday
Another thing to have in mind is to use the explain plan to see what could be the reason the query is that slow.

How to select large data from SQL Server 2008 R2?

I have run into problem with selecting large data from SQL Server. I have a view with 200 columns and 200 000 rows. And I am using the view for Solr indexing.
I tried to select data with paging but it took a lot of time(more then 6 hours). Now i am selecting it without paging and it take 1 hour. But SQL Server takes a lot of memory.
What is the best method or approach to select large data in such situations from SQL Server 2008 R2?
Thanks in advance.
200k rows is not that much and definitely shouldn't take 6 hours, not even 1 hour.
I did not understant if the problem is on the actuall select or bringing the result to the application.
I would recomend running the select with NOLOCK to ignore blockings, maybe your table is beeing accessed by other processes when you are running the query
SELECT * FROM TABLE WITH(NOLOCK)
If the problem is on bringing the data to the application you'll need to provide mroe details on how you are doing it
I'd suggest taking a look at your execution plan. Look at the properties in the very first operator and see if you're getting "Good Enough Plan Found" or a timeout. If it's the latter, you may have a very complicated view or you may be nesting views (a view calling a view). See what you can do to simplify the query in order to give the optimizer a chance to create a good execution plan.

Access 2007 First cn.Execute Statement very slow

I am currently working on an application in Access 2007 with a split FE and BE. FE is local wiht BE on a network share. To eliminate some of the issues found with using linked tables over a network, I am attempting, through VBA using ADO, to load two temp tables with data from two linked when the application first loads using the cn.Execute "INSERT INTO TempTable1 SELECT * FROM LinkedTable1" and cn.Execute "INSERT INTO TempTable2 SELECT * FROM LinkedTable2".
LinkedTable1 has 45,552 records in it and LinkedTable2 has 45,697 records in it.
The first execute statement takes anwhere from 50-85seconds. However the second execute statement takes no more than 9 seconds. These times are consistent. In an attempt to see if there were issues with one of the tables and not the other, I have switched the order of the statements in my code and the times still come out the same (first execute is way too long and second execute is very fast). (As a side note, I have also tried DAO using the CurrentDB.Execute command with no different results.) This would make sense to me if the first statement was processing more records than the second, but although a small number, the second table has more records than the first!
Does anyone have ANY suggestions on why this is happening and/or how to get this first execute statement to speed up?
Thanks in advance!
ww
What indexes do you have defined on the two temp tables, as well as primary key definitions? Updating the indexes as the data is appended could be one reason one table is slower.
My guess is that there are two sources for the difference:
the initial creation of the remote LDB file when you execute the first INSERT statement. This shows up as overhead in the first SQL command, when it's actually something that persists through both.
caching: likely the file is small enough that Jet/ACE is pulling large chunks of it across the wire (the header and metadata, plus the requested data pages) during the first operation so that there's much less data that is not already in local memory when the second command is issued.
My question is why you are having problems with linked table performance in the first place. Solve that and you then won't have to muck about with temp tables. See Tony Toews's Performance FAQ.