Access - Form Value Usage Optimization - ms-access

Say I use a value, Me!txtUsername.Value, throughout an event and pass it to many functions.
Is it more efficient or best practice to:
A) Set the value into a variable
Dim username As String
username = Me!txtUsername.Value
OR
B) Use it explicitly throughout the event
iAmAFunction(Me!txtUsername.Value)
OR
C) There is a negligible difference and it is simply preference?

I think the faster way to retrieve the same value multiple times would be to assign it to a variable the first time and read it from the variable thereafter.
But I suspect you would need a fairly extreme edge case to actually notice a difference. So I'll say the correct answer is C: negligible difference.
Personally, I wouldn't be concerned about a performance difference here. I would likely prefer to repeatedly type and read strUsername instead of Me!txtUsername.Value.
My gut reaction is this is a micro-optimization which is seldom worth worrying about.
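The same trade-off can be sketched outside VBA. Below is an illustrative Python micro-benchmark (the TextBox class is a hypothetical stand-in for a form control, not an Access API): caching the property read in a local variable does skip a lookup on each use, but the saving is so small that preference and readability should decide, which supports answer C.

```python
import timeit

class TextBox:
    """Hypothetical stand-in for a form control; .Value mimics Me!txtUsername.Value."""
    def __init__(self, value):
        self.Value = value

txt_username = TextBox("alice")

def read_control_each_time():
    total = 0
    for _ in range(1000):
        total += len(txt_username.Value)   # repeated property lookup (option B)
    return total

def read_cached_variable():
    username = txt_username.Value          # cache once (option A)
    total = 0
    for _ in range(1000):
        total += len(username)
    return total

# Both produce the same result; only the lookup cost differs.
assert read_control_each_time() == read_cached_variable()
print("lookup each time:", timeit.timeit(read_control_each_time, number=100))
print("cached variable: ", timeit.timeit(read_cached_variable, number=100))
```

On a typical run the cached version is slightly faster, but the absolute difference over 100,000 reads is a few milliseconds, well below anything a user would notice.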

Related

Should I sanitize inputs to a parametrized query?

I have a couple of basic questions on parametrized queries
Consider this code:
$id = (int)$_GET['id'];
mysql_query("UPDATE table SET field=1 WHERE id=".$id);
Now the same thing using a parametrized query
$sql = "UPDATE table SET field=1 WHERE id=?";
$q = $db->prepare($sql);
$q->execute(array($_GET['id']));
My questions are:
is there any situation where the first code (i.e. with the (int) cast) is unsafe?
is the second piece of code OK or should I also cast $_GET['id'] to int?
is there any known vulnerability of the second piece of code? That is, is there any way an SQL attack can be made if I am using the second query?
is there any situation where the first code (i.e. with the (int) cast) is unsafe?
I'm not a PHP expert, but I think there shouldn't be. That's not to say that PHP doesn't have bugs (either known or yet to be discovered) that could be exploited here.
is the second piece of code OK or should I also cast $_GET['id'] to int?
Likewise, the second piece of code should be absolutely fine - even if the data type was a string, MySQL would know not to evaluate it for SQL as it's a parameter and therefore only to be treated as a literal value. However, there's certainly no harm in also performing the cast (which would avoid any flaws in MySQL's handling of parameters) - I'd recommend doing both.
EDIT: @Tomalak makes a very good point about the cast potentially resulting in incorrect data, and suggests first verifying your inputs with sanity checks such as is_numeric(); I agree wholeheartedly.
is there any known vulnerability of the second piece of code? That is, is there any way an SQL attack can be made if I am using the second query?
Not to my knowledge.
(int) will yield 0 when the conversion fails. This could lead to updating the wrong record. Besides, it's sloppy and an open invitation to "forget" proper type casting when the query gets more complex later on.
It's safe in its current form (against SQL injection, not against updating the wrong record) but I'd still not recommend it. Once the query gets more complex you're bound to use prepared statements anyway, so just do it right from the start - also for the sake of consistency.
That's sloppy, too. The parameter will be transferred to the DB as a string and the DB will try to cast it. It's safe (against SQL injection), but unless you know exactly how the DB server reacts when you pass invalid data, you should sanitize the value up-front (is_numeric() and casting).
No. (Unless there is a bug in PDO, that is.)
As a rule of thumb:
Don't pass unchecked data to the database and expect the right thing to happen.
Don't knowingly pass invalid data and trust that the other system reacts in a certain way. Do sanity checks and error handling yourself.
Don't make "Oh, that converts to 0 and I don't have a record with ID 0 anyway so that's okay." part of your thought process.
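To make the rules of thumb concrete, here is a small sketch in Python with sqlite3 (the table and set_flag helper are hypothetical, chosen to mirror the PHP example): validate the input up front instead of trusting a cast, then pass it as a bound parameter so it never becomes part of the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field INTEGER DEFAULT 0)")
conn.execute("INSERT INTO t (id) VALUES (42)")

def set_flag(raw_id):
    # Sanity-check first: reject anything that isn't a plain integer,
    # rather than letting a cast silently coerce bad input to 0.
    if not str(raw_id).isdigit():
        raise ValueError(f"invalid id: {raw_id!r}")
    # The placeholder keeps the value out of the SQL text entirely,
    # so it is always treated as a literal, never as SQL.
    conn.execute("UPDATE t SET field = 1 WHERE id = ?", (int(raw_id),))

set_flag("42")  # valid input: row 42 is updated
print(conn.execute("SELECT field FROM t WHERE id = 42").fetchone()[0])  # 1

try:
    set_flag("42; DROP TABLE t")  # rejected before it reaches the DB
except ValueError as e:
    print(e)
```

This combines both defenses discussed above: the parameter prevents injection, and the explicit check prevents the "converts to 0, updates the wrong record" failure mode.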

"Min in Select statement" or DMin(). Which one is preferable?

I needed to find minimum revenue from a table tbl_Revenue. I found out two methods to do that:
Method 1
Dim MinRevenueSQL As String
Dim rsMinRev As DAO.Recordset
MinRevenueSQL = "SELECT Min(tbl_Revenue.Revenue_Value) AS MinRevenue FROM tbl_Revenue WHERE (((tbl_Revenue.Division_ID)=20) AND ((tbl_Revenue.Period_Type)='Annual'));"
Set rsMinRev = CurrentDb.OpenRecordset(MinRevenueSQL)
MinRev = rsMinRev!MinRevenue
Method 2
MinRev2 = DMin("Revenue_Value", "tbl_Revenue", "(((tbl_Revenue.Division_ID)=20) AND ((tbl_Revenue.Period_Type)='Annual'))")
I have following questions:
which one of them is computationally more efficient? Is there a big difference in efficiency if, instead of the tbl_Revenue table, there is a SELECT statement using joins?
Is there a problem with the accuracy of the DMin function? (By accuracy I mean: are there any loopholes I need to be aware of before using DMin?)
I suspect that the answer may vary depending on your situation.
In a single-user situation, @transistor1's testing method will give you a good answer for an isolated lookup.
But on a db that's shared on a network, if you have already done Set db = CurrentDb, then the SELECT method should be faster, since it does not require opening a second connection to the db, which is slow.
In the same way, it is more efficient to Set db = CurrentDb once and reuse that db everywhere.
In situations where I want to make sure I have the best speed, I declare Public db As DAO.Database when opening the app. Then, in every module where it is required, I use
If db Is Nothing Then Set db = CurrentDb
In your specific code, you are running it once so it doesn't make much of a difference. If it's in a loop or a query and you are combining hundreds or thousands of iterations, then you will run into issues.
If performance over thousands of iterations is important to you, I would write something like the following:
Sub runDMin()
    Dim x As Single, i As Long, MinRev2 As Variant
    x = Timer
    For i = 1 To 10000
        MinRev2 = DMin("Revenue_Value", "tbl_Revenue", "(((tbl_Revenue.Division_ID)=20) AND ((tbl_Revenue.Period_Type)='Annual'))")
    Next i
    Debug.Print "Total runtime seconds: " & Timer - x
End Sub
Then implement the same for the DAO query, replacing the MinRev2 part. Run them both several times and take an average. Try your best to simulate the conditions it will be run under; for example if you will be changing the parameters within each query, do the same, because that will most likely have an effect on the performance of both methods. I have done something similar with DAO and ADO in Access and was surprised to find out that under my conditions, DAO was running faster (this was a few years ago, so perhaps things have changed since then).
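The connection-reuse point can also be illustrated outside Access. Here is a hedged Python/sqlite3 sketch (the table mirrors tbl_Revenue but is hypothetical, and sqlite3 is only an analogy for DAO): reopening a file-backed connection for every query, which is roughly what repeatedly calling CurrentDb costs, is measurably slower than reusing one cached connection.

```python
import sqlite3, time, tempfile, os

# File-backed DB so that opening a connection has a real cost.
path = os.path.join(tempfile.mkdtemp(), "rev.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE tbl_Revenue (Division_ID INT, Period_Type TEXT, Revenue_Value REAL)")
con.executemany("INSERT INTO tbl_Revenue VALUES (20, 'Annual', ?)",
                [(v,) for v in range(100)])
con.commit()

SQL = ("SELECT MIN(Revenue_Value) FROM tbl_Revenue "
       "WHERE Division_ID = 20 AND Period_Type = 'Annual'")

def min_reopen_each_time(n):
    for _ in range(n):
        c = sqlite3.connect(path)      # analogous to grabbing a fresh db handle per call
        c.execute(SQL).fetchone()
        c.close()

def min_reuse_connection(n):
    for _ in range(n):
        con.execute(SQL).fetchone()    # analogous to a cached Public db object

t = time.perf_counter(); min_reopen_each_time(500)
print("reopen each time:", time.perf_counter() - t)
t = time.perf_counter(); min_reuse_connection(500)
print("reuse connection:", time.perf_counter() - t)
```

The exact numbers depend on the machine, but the reuse version consistently wins, which is the same reason caching the db object helps in Access.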
There is definitely a difference when it comes to using DMin in a query to get a minimum from a foreign table. From the Access docs:
Tip: Although you can use the DMin function to find the minimum value from a field in a foreign table, it may be more efficient to create a query that contains the fields that you need from both tables, and base your form or report on that query.
However, this is slightly different from your situation, in which you are running both from a VBA method.
I have tended to believe (maybe erroneously because I don't have any evidence) that the domain functions (DMin, DMax, etc.) are slower than using SQL. Perhaps if you run the code above you could let us know how it turns out.
If you write the DMin call correctly, there are no accuracy issues that I am aware of. Have you heard that there were? Essentially, the call should be: DMin("<Field Name>", "<Table Name>", "<Where Clause>")
Good luck!

Check for equality before assigning?

Is it a good practice to assign a value only if it's not equal to the assignee? For example, would:
bool isVisible = false;
if(TextBox1.Visible != isVisible)
TextBox1.Visible = isVisible;
be more desirable than:
bool isVisible = false;
TextBox1.Visible = isVisible;
Furthermore, does the answer depend on the data type, like an object with a costlier Equals method versus an object with a costlier assignment method?
From a readability standpoint, I'd definitely prefer the second way -- just assign the darn thing.
Some object properties have semantics that require that assigning a value the property already holds will have a particular effect. For example, setting an object's "text" may force a redraw, even if the value doesn't change. When dealing with such objects, unless one wants to force the action to take place, one should often test and set if unequal.
Generally, with fields, there is no advantage to doing a comparison before a set. There is one notable exception, however: if many concurrently-running threads will be wanting to set a field to the same value, and it's likely to already hold that value, caching behavior may be very bad if all the threads are unconditionally writing to that field, since a processor that wants to write to it will have to acquire the cache line from the most recent processor that wrote it. By contrast, if all the processors are simply reading the field and deciding to leave it alone, they can all share the cache line, resulting in much better performance.
Your instincts seem about right - it depends on the cost of the operations.
In your example, making a text box visible or invisible, the cost of the test is imperceptible (just check a bit in the window structure) and the cost of assignment is also typically imperceptible (repaint the window). In fact, if you set the "visible" bit to its existing value you'll still incur the function call cost, but the window manager will check the bit and return immediately. In this case, just go ahead and assign it.
But in other cases it might matter. For example, suppose you have a cached copy of a long string or binary object, and whenever you assign a new value it gets saved back to a database. Then you might find that the cost of testing for equality every time is worth it to save unnecessary writes to the database. No doubt you can imagine more expensive scenarios.
So in the general case you've got at least these primary variables to consider: the cost of the test, the cost of the assignment, and the relative frequencies of assigning a new value versus assigning the same value.
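The expensive-assignment case above can be sketched as a Python property setter (CachedDocument and its save counter are hypothetical stand-ins for a real object whose writes hit a database): a cheap equality test guards the costly write.

```python
class CachedDocument:
    """Illustrative only: assignment triggers an expensive save,
    so the setter tests for equality before writing."""
    def __init__(self, text=""):
        self._text = text
        self.save_count = 0

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        if value == self._text:      # cheap test...
            return
        self._text = value
        self._save_to_database()     # ...avoids this costly write

    def _save_to_database(self):
        self.save_count += 1         # stand-in for a real DB round-trip

doc = CachedDocument("hello")
doc.text = "hello"     # same value: no save triggered
doc.text = "world"     # changed value: one save
print(doc.save_count)  # 1
```

Conversely, if the setter had no side effects, the guard would just be noise, which is the "just assign the darn thing" case above.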

Organize address cache

I need to organize a cache in a MySQL database for address-to-coordinates lookups. What is the best practice for storing the address? Do I need to compress the address string, or can I use it as-is?
edit:
OK, let me restate my question.
How do I store a long string (up to 512 characters) in the database if I need to search by exactly this string in the future?
If you are absolutely certain your search string can be normalized (e.g. stripping extra spaces, forcing lower case, etc.) so as to avoid ambiguity, and you only need a full match (i.e. you either find exactly the normalized string or you don't, with no searching by substring, soundex, or partial match, and no sorting by it; this is how I read your "by exactly this string"), you could consider calculating a hashcode of the string, storing it in the DB, and indexing that.
If you use a hashcode function that returns a number, you will have a very efficient index. And of course you can still keep the original string field for printing and other access approaches.
Possible problems: while a hashcode minimizes the chance of a collision, collisions cannot be guaranteed not to happen, so you have to handle that case, too.
Also, unless you have lots and lots of addresses, I doubt that the speedup gain will be worth the trouble.
MySql can manage coordinates and operates on these values, try looking at http://dev.mysql.com/doc/refman/5.0/en/spatial-extensions.html
If you want something simpler: personally, I usually store the city code, the city name, and the rest of the address string separately. Then I can index and search on these fields (one by one, or in combination).
If you want a simple use of coordinates, you can simply store the latitude/longitude and do basic comparisons
Answer can be found here

MySQL: is SELECT with LIKE expensive?

The following question is regarding the speed between selecting an exact match (example: INT) vs a "LIKE" match with a varchar.
Is there much difference? The main reason I'm asking this is because I'm trying to decide if it's a good idea to leave IDs out of my current project.
For example, instead of:
http://mysite.com/article/391239/this-is-an-entry
Change to:
http://mysite.com/article/this-is-an-entry
Do you think I'll experience any performance problems in the long run? Should I keep the IDs?
Note:
I would use LIKE to keep it easier for users to remember. For example, if they write "http://mysite.com/article/this-is-an", it would redirect to the correct article.
Regarding the number of pages: let's say I'm at around 79,230 and the app is growing fast, say 1,640 entries per day.
An INT comparison will be faster than a string (varchar) comparison. A LIKE comparison is even slower as it involves at least one wildcard.
Whether this is significant in your application is hard to tell from what you've told us. Unless it's really intensive, i.e. you're doing gazillions of these comparisons, I'd go with clarity for your users.
Another thing to think about: are users always going to type the URL? Or are they simply going to use a search engine? These days I simply search rather than try to remember a URL, which would make this a non-issue for me as a user. What are your users like? Can you tell from your application how they access your site?
Firstly, I think it doesn't really matter either way; yes, it will be slower, as a LIKE clause involves more work than a direct comparison, but the difference is negligible on normal sites.
This can be easily tested if you were to measure the time it took to execute your query, there are plenty of examples to help you in this department.
To move away slightly from your question, you have to ask yourself whether you even need to use LIKE for this query, because 'this-is-an-entry' should be unique, right?
SELECT id, friendly_url, name, content FROM articles WHERE friendly_url = 'this-is-an-article';
A "SELECT * FROM x WHERE id = 391239" query is going to be faster than "SELECT * FROM x WHERE slug = 'some-key'", which in turn is going to be faster than "SELECT * FROM x WHERE slug LIKE '%some-key%'" (the presence of wildcards isn't going to make a heap of difference).
How much faster? Twice as fast? Quite likely. Ten times as fast? Stretching it, but possible. The real questions here are 1) does it matter and 2) should you even be using LIKE in the first place.
1) Does it matter
I'd probably say not. If you indeed have 391,239+ unique articles/pages - and assuming you get a comparable level of traffic, then this is probably just one of many scaling problems you are likely to encounter. However, I'd warrant this is not the case, and therefore you shouldn't worry about a million page views until you get to 1 million and one.
2) Should you even be using LIKE
No. If the page/article title/name is part of the URL "slug", it has to be unique. If it's not, then you are shooting yourself in the foot in terms of SEO and writing yourself a maintenance nightmare. If the title/name is unique, then you can just use "WHERE title = 'some-page'", making sure the title column has a unique index on it.
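As a sketch of that approach (Python/sqlite3 here purely for illustration; the table and data are hypothetical, and MySQL syntax differs slightly): declare the slug UNIQUE so the database both enforces one-slug-per-article and gives you an index for fast exact-match lookups.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE articles (
    id    INTEGER PRIMARY KEY,
    slug  TEXT NOT NULL UNIQUE,  -- unique index: one slug per article, fast seeks
    title TEXT)""")
con.execute("INSERT INTO articles (slug, title) "
            "VALUES ('this-is-an-entry', 'This Is An Entry')")

# Exact match on the indexed slug: an index seek, no LIKE needed.
row = con.execute("SELECT id, title FROM articles WHERE slug = ?",
                  ("this-is-an-entry",)).fetchone()
print(row)

# The UNIQUE constraint rejects a duplicate slug outright,
# preventing the hijacked-article scenario described below.
try:
    con.execute("INSERT INTO articles (slug, title) "
                "VALUES ('this-is-an-entry', 'Duplicate')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```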
Edit
Your plan of using LIKE for the URLs is utterly crazy. What happens if someone visits
yoursite.com/articles/the
Do you return a list of all the pages starting with "the"? What happens if:
Author A creates
yoursite.com/articles/stackoverflow-is-massive
2 days later Author B creates
yoursite.com/articles/stackoverflow-is-massively-flawed
Not only will A be pretty angry that his article has been hijacked; all the permalinks he may have sent out will be broken, and Google is never going to give your articles any reasonable PageRank because the content keeps changing and effectively diluting itself.
Sometimes there is a pretty good reason you've never seen your amazing new "idea/feature/invention/time-saver" anywhere else before.
INT is much faster.
In the string case, I think you should query not with LIKE but with =, because you are looking for this-is-an-entry, not for this-is-an-entry-and-something.
There are a few things to consider:
The type of search performed on the database will be an "index seek" (a search for a single row using an index) most of the time.
This type of exact-match operation on a single row is not significantly faster using ints than strings; for any practical purpose, they cost the same.
What you can do is the following optimization: search the database using an exact match (no wildcards), which is as fast as using an int index. If there is no match, do a fuzzy search (using wildcards); this is more expensive, but it is also rarer and can produce more than one result. Some form of result ranking is needed if you want to return the best match.
Pseudocode:
Search for an exact match: Article = 'entry'
If (match is found) display page
If (match is not found) search using wildcards: Article LIKE 'entry%'
If (one appropriate match is found) display page
If (more relevant matches) display a "Did you mean ...?" page
If (no matches) display error page
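The pseudocode above can be sketched concretely; this Python/sqlite3 version is illustrative only (the articles table and find_article helper are assumptions, and real ranking would be more involved): the cheap exact match runs first, and the wildcard search is only the miss path.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE articles (slug TEXT UNIQUE, title TEXT)")
con.executemany("INSERT INTO articles VALUES (?, ?)", [
    ("this-is-an-entry", "This Is An Entry"),
    ("this-is-another-entry", "This Is Another Entry"),
])

def find_article(slug):
    # 1) Cheap exact match on the unique index.
    row = con.execute("SELECT title FROM articles WHERE slug = ?",
                      (slug,)).fetchone()
    if row:
        return row[0]
    # 2) Only fall back to the more expensive wildcard search on a miss.
    rows = con.execute("SELECT title FROM articles WHERE slug LIKE ?",
                       (slug + "%",)).fetchall()
    if len(rows) == 1:
        return rows[0][0]                      # single fuzzy hit: show the page
    if rows:
        return "Did you mean: " + ", ".join(t for (t,) in rows)
    return None                                # no match: error page

print(find_article("this-is-an-entry"))  # exact hit
print(find_article("this-is-an"))        # fuzzy: two candidates
print(find_article("nope"))              # no match
```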
Note: keep in mind that fuzzy URLs are not recommended from an SEO perspective, because people can link to your site using multiple URLs, which will split your PageRank instead of increasing it.
If you put an index on the varchar field it should be OK (performance-wise); it really depends on how many pages you are going to have. You also have to be more careful and sanitize the string to prevent SQL injection, e.g. only allow a-z, 0-9, -, _, etc. in your query.
I would still prefer an integer id as it is faster and safer, change the format to something nicer like:
http://mysite.com/article/21-this-is-an-entry.html
As said, an INT comparison is cheaper than a VARCHAR one, and if the table is indexed on the field you're searching then that will help too, as the server won't have to fall back to scanning the table.
One thing which will help validate your queries for speed and sense is EXPLAIN. You can use this to show which indexes your query is using, as well as execution times.
To answer your question, if it's possible to build your system using exact matches on the article ID (ie an INT) then it'll be much "lighter" than if you're trying to match the whole url using a LIKE statement. LIKE will obviously work, but I wouldn't want to run a large, high traffic site on it.