How to check when a VBA module was modified? - ms-access

I have written a version control module. The AutoExec macro launches it whenever I or one of the other maintainers logs in. It looks for database objects that have been created or modified since the previous update, adds an entry to the Versions table, and then opens the table (filtered to the last record) so I can type in a summary of the changes I made.
It is working great for Tables, Queries, Forms, Macros, etc., but I cannot get it to work correctly for modules.
I have found two different properties that suggest a Last Modified date ...
CurrentDB.Containers("Modules").Documents("MyModule").Properties("LastUpdated").Value
CurrentProject.AllModules("MyModule").DateModified
The first one (CurrentDB) always shows "LastUpdated" as the Date it was created, unless you modify the description of the module or something in the interface. This tells me that this property is purely for the container object - not what's in it.
The second one works a lot better. It accurately shows the date when I modify and compile/save the module. The only problem is that when you save or compile a module, it saves / compiles ALL the modules again, and therefore sets the DateModified field to the same date across the board. It kind of defeats the purpose of having the DateModified property on the individual modules doesn't it?
So my next course of action is going to be a bit more drastic. I am thinking I will need to maintain a list of all the modules and count the lines of code in each module using VBA Extensibility. Then, if the line count differs from what the list has recorded, I know that the module has been modified. I just won't know when, other than "since the last time I checked".
Does anyone have a better approach? I'd rather not take this route, because I can see it noticeably affecting database performance (in the bad kind of way).

Here's a simpler suggestion:
Calculate the MD5 hash for each module.
Store it in the Versions table.
Recalculate it for each module during the AutoExec and compare it to the one in the Versions table. If it's different, you can assume it has been changed (while MD5 is bad for security, it's still solid for integrity).
To get the text from a module using VBE Extensibility, you can do
Dim oMod As CodeModule
Dim strMod As String
Set oMod = VBE.ActiveVBProject.VBComponents(1).CodeModule
strMod = oMod.Lines(1, oMod.CountOfLines)
You can then use the following MD5 hash function, modified from this answer. Take the hash of each module, store it, and compare it in your AutoExec.
Public Function StringToMD5Hex(s As String) As String
Dim enc
Dim bytes() As Byte
Dim outstr As String
Dim pos As Integer
Set enc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
'Convert the string to a byte array and hash it
bytes = StrConv(s, vbFromUnicode)
bytes = enc.ComputeHash_2((bytes))
'Convert the byte array to a hex string
For pos = 0 To UBound(bytes)
outstr = outstr & LCase(Right("0" & Hex(bytes(pos)), 2))
Next
StringToMD5Hex = outstr
Set enc = Nothing
End Function
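As a rough sketch of how the two pieces above might be combined, the routine below walks every component of the active VBA project and prints its name next to its hash. It assumes StringToMD5Hex from this answer is in the same project; the name PrintModuleHashes is made up for illustration, and the components are late-bound as Object so no reference to the extensibility library is needed. Empty modules are given an empty hash here, which is an arbitrary choice.

```vba
' Hypothetical helper: lists each module with the MD5 hash of its source text.
Public Sub PrintModuleHashes()
    Dim comp As Object      ' VBIDE.VBComponent, late-bound
    Dim src As String
    Dim hash As String

    For Each comp In Application.VBE.ActiveVBProject.VBComponents
        With comp.CodeModule
            If .CountOfLines = 0 Then
                ' Lines(1, 0) is not valid, so treat an empty module as "no hash"
                hash = vbNullString
            Else
                src = .Lines(1, .CountOfLines)
                hash = StringToMD5Hex(src)
            End If
        End With
        Debug.Print comp.Name, hash
    Next comp
End Sub
```

Writing the hashes to the Versions table instead of the Immediate window depends on that table's layout, which isn't shown in the question.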

You can't know when a module was modified. The VBIDE API doesn't even tell you whether a module was modified, so you have to figure that out yourself.
The VBIDE API makes it excruciatingly painful - as you've noticed.
Rubberduck doesn't deal with host-specific components yet (e.g. tables, queries, etc.), but its parser does a pretty good job at telling whether a module was modified since the last parse.
"Modified since last time I checked" is really all you need to know. You can't rely on line counts though, because this:
Option Explicit
Sub DoSomething
'todo: implement
End Sub
Would be the same as this:
Option Explicit
Sub DoSomething
DoSomethingElse 42
End Sub
And obviously you'd want that change to be picked up and tracked. Comparing every character on every single line of code would work, but there's a much faster way.
The general idea is to grab a CodeModule's contents, hash it, and then compare against the previous content hash. If anything was modified, we're looking at a "dirty" module. Rubberduck's implementation is C#, and I don't know if there's a COM library that can readily hash a string from VBA, but worst-case you could compile a little utility DLL in .NET that exposes a COM-visible function that takes a String and returns a hash for it. It shouldn't be too complicated.
Here's the relevant code from Rubberduck.VBEditor.SafeComWrappers.VBA.CodeModule, if it's any help:
private string _previousContentHash;
public string ContentHash()
{
using (var hash = new SHA256Managed())
using (var stream = Content().ToStream())
{
return _previousContentHash = new string(Encoding.Unicode.GetChars(hash.ComputeHash(stream)));
}
}
public string Content()
{
return Target.CountOfLines == 0 ? string.Empty : GetLines(1, CountOfLines);
}
public string GetLines(Selection selection)
{
return GetLines(selection.StartLine, selection.LineCount);
}
public string GetLines(int startLine, int count)
{
return Target.get_Lines(startLine, count);
}
Here Target is a Microsoft.Vbe.Interop.CodeModule object - if you're in VBA land then that's simply a CodeModule, from the VBA Extensibility library; something like this:
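Public Function IsModified(ByVal target As CodeModule, ByVal previousHash As String) As Boolean
    Dim content As String
    If target.CountOfLines = 0 Then
        content = vbNullString
    Else
        ' The VBA-side CodeModule exposes Lines(StartLine, Count); GetLines is the C# wrapper's name
        content = target.Lines(1, target.CountOfLines)
    End If
    Dim hash As String
    ' MyHashingLibrary.MyHashingFunction is a placeholder for whatever hashing routine you use
    hash = MyHashingLibrary.MyHashingFunction(content)
    IsModified = (hash <> previousHash)
End Function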
So yeah, your "drastic" solution is pretty much the only reliable way to go about it. Few things to keep in mind:
"Keeping a list of all modules" will work, but if you only store module names, and a module was renamed, your cache is stale and you need a way to invalidate it.
If you store the ObjPtr of each module object rather than its name, I'm not sure how reliable that is in VBA, but I can tell you that through COM interop, a COM object's hashcode isn't going to be consistent between calls, so you'll end up with a stale cache and need a way to invalidate it that way too. Possibly not an issue with a 100% VBA solution though.
I'd go with a Dictionary that stores the modules' object pointer as a key, and their content hash as a value.
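A minimal sketch of such a cache is below. It deliberately keys the Dictionary by module name rather than by object pointer, which is simpler but means a renamed module shows up as new (the stale-cache caveat above). The names FindModifiedModules and HashText are invented for this example, and HashText late-binds the .NET SHA256Managed class, which assumes a .NET Framework version that exposes it to COM is installed. The cache lives only in memory, so the first call in a session reports every module.

```vba
Option Explicit

' Scripting.Dictionary (late-bound): module name -> content hash
Private mHashes As Object

' Returns the names of modules that are new or whose content hash changed
' since the previous call, and refreshes the cache as it goes.
Public Function FindModifiedModules() As Collection
    Dim changed As Collection
    Dim comp As Object      ' VBIDE.VBComponent, late-bound
    Dim hash As String

    Set changed = New Collection
    If mHashes Is Nothing Then Set mHashes = CreateObject("Scripting.Dictionary")

    For Each comp In Application.VBE.ActiveVBProject.VBComponents
        With comp.CodeModule
            If .CountOfLines = 0 Then
                hash = vbNullString
            Else
                hash = HashText(.Lines(1, .CountOfLines))
            End If
        End With

        If Not mHashes.Exists(comp.Name) Then
            changed.Add comp.Name           ' not seen before: new or renamed
        ElseIf mHashes(comp.Name) <> hash Then
            changed.Add comp.Name           ' same name, different content
        End If
        mHashes(comp.Name) = hash
    Next comp

    Set FindModifiedModules = changed
End Function

' Hypothetical hashing helper: SHA-256 over the string's UTF-16 bytes, as hex.
Private Function HashText(ByVal content As String) As String
    Dim sha As Object
    Dim bytes() As Byte
    Dim i As Long

    Set sha = CreateObject("System.Security.Cryptography.SHA256Managed")
    bytes = content                         ' a String assigns directly to a Byte array
    bytes = sha.ComputeHash_2((bytes))
    For i = LBound(bytes) To UBound(bytes)
        HashText = HashText & Right$("0" & Hex$(bytes(i)), 2)
    Next i
End Function
```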
That said as the administrator of the Rubberduck project, I'd much rather see you join us and help us integrate full-featured source control (i.e. with host-specific features) directly into the VBE =)

I thought I would add the final code I came up with for a hash / checksum generation module, since that was really the piece I was missing. Credit to the @BlackHawk answer for filling in the gap by showing that you can late bind .NET classes - that's going to open up a lot of possibilities for me now.
I have finished writing my Version checker. There were a few caveats that I encountered that made it hard to rely on the LastUpdated date.
Resizing the columns in a Table or Query changed the LastUpdated date.
Compiling any Module compiled all modules, and thus updated every module's LastUpdated date (as was already pointed out).
Adding a filter to a form in View mode causes the form's Filter field to be updated, which in turn updates the LastUpdated date.
When using SaveAsText on a Form or Report, changing a printer or display driver can affect the PrtDevMode encodings, so it is necessary to strip them out before calculating a checksum.
For Tables I built a string that was a concatenation of the table name, all field names with their size and data types. I then computed the hash on those.
For Queries I simply computed the hash on the SQL.
For Modules, Macros, Forms, and Reports I used the Application.SaveAsText to save it to a temporary file. I then read that file in to a string and computed a hash on it. For Forms and Reports I didn't start adding to the string until the "Begin" line passed.
Seems to be working now and I haven't come across any situations where it would prompt for a version revision when something wasn't actually changed.
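For the module case, the SaveAsText step described above might look roughly like the sketch below. The function name ModuleTextViaSaveAsText and the temp-file naming are assumptions made for illustration, and Application.SaveAsText is an undocumented Access method, so treat this as a starting point rather than a definitive implementation. Forms and Reports would additionally need the PrtDevMode stripping and the "Begin" line handling mentioned earlier, which is not shown.

```vba
' Hypothetical helper: exports a module with SaveAsText and returns the file's text.
Public Function ModuleTextViaSaveAsText(ByVal moduleName As String) As String
    Dim tempPath As String
    Dim fileNum As Integer
    Dim buffer As String

    ' Assumed location: the user's temp folder
    tempPath = Environ$("TEMP") & "\" & moduleName & ".txt"
    Application.SaveAsText acModule, moduleName, tempPath

    fileNum = FreeFile
    Open tempPath For Binary Access Read As #fileNum
    If LOF(fileNum) > 0 Then
        ' Size the buffer to the file length, then read the whole file in one call
        buffer = Space$(LOF(fileNum))
        Get #fileNum, , buffer
    End If
    Close #fileNum

    Kill tempPath               ' remove the temporary export
    ModuleTextViaSaveAsText = buffer
End Function
```

The returned string can then be passed to whatever hashing routine is in use.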
For calculating a checksum or hash, I built a Class Module named CryptoHash. Here is the full source below. I optimized the Bytes Array to Hex String conversion to be quicker.
Option Compare Database
Option Explicit
Private objProvider As Object ' Late Bound object variable for MD5 Provider
Private objEncoder As Object ' Late Bound object variable for Text Encoder
Private strArrHex(255) As String ' Hexadecimal lookup table array
Public Enum hashServiceProviders
MD5
SHA1
SHA256
SHA384
SHA512
End Enum
Private Sub Class_Initialize()
Const C_HEX = "0123456789ABCDEF"
Dim intIdx As Integer ' Our Array Index Iteration variable
' Instantiate our two .NET class objects
Set objEncoder = CreateObject("System.Text.UTF8Encoding")
Set objProvider = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
' Initialize our Lookup Table (array)
For intIdx = 0 To 255
' A byte is represented within two hexadecimal digits.
' When divided by 16, the whole number is the first hex character
' the remainder is the second hex character
' Populate our Lookup table (array)
strArrHex(intIdx) = Mid(C_HEX, (intIdx \ 16) + 1, 1) & Mid(C_HEX, (intIdx Mod 16) + 1, 1)
Next
End Sub
Private Sub Class_Terminate()
' Explicitly remove the references to our objects so Access can free memory
Set objProvider = Nothing
Set objEncoder = Nothing
End Sub
Public Property Let Provider(NewProvider As hashServiceProviders)
' Switch our Cryptographic hash provider
Select Case NewProvider
Case MD5:
Set objProvider = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
Case SHA1:
Set objProvider = CreateObject("System.Security.Cryptography.SHA1CryptoServiceProvider")
Case SHA256:
Set objProvider = CreateObject("System.Security.Cryptography.SHA256Managed")
Case SHA384:
Set objProvider = CreateObject("System.Security.Cryptography.SHA384Managed")
Case SHA512:
Set objProvider = CreateObject("System.Security.Cryptography.SHA512Managed")
Case Else:
Err.Raise vbObjectError + 2029, "CryptoHash::Provider", "Invalid Provider Specified"
End Select
End Property
' Converts an array of bytes into a hexadecimal string
Private Function Hash_BytesToHex(bytArr() As Byte) As String
Dim lngArrayUBound As Long ' The Upper Bound limit of our byte array
Dim intIdx As Long ' Our Array Index Iteration variable
' Not sure if VBA re-evaluates the loop terminator with every iteration or not
' When speed matters, I usually put it in its own variable just to be safe
lngArrayUBound = UBound(bytArr)
' For each element in our byte array, add a character to the return value
For intIdx = 0 To lngArrayUBound
Hash_BytesToHex = Hash_BytesToHex & strArrHex(bytArr(intIdx))
Next
End Function
' Computes a Hash on the supplied string
Public Function Compute(SourceString As String) As String
Dim BytArrData() As Byte ' Byte Array produced from our SourceString
Dim BytArrHash() As Byte ' Byte Array returned from our MD5 Provider
' Note:
' Because some languages (including VBA) do not support method overloading,
' the COM system uses "name mangling" in order to allow the proper method
' to be called. This name mangling appends a number at the end of the function.
' You can check the MSDN documentation to see how many overloaded variations exist
' Convert our Source String into an array of bytes.
BytArrData = objEncoder.GetBytes_4(SourceString)
' Compute the MD5 hash and store in an array of bytes
BytArrHash = objProvider.ComputeHash_2(BytArrData)
' Convert our Bytes into a hexadecimal representation
Compute = Hash_BytesToHex(BytArrHash)
' Free up our dynamic array memory
Erase BytArrData
Erase BytArrHash
End Function
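A short usage sketch for the class above, assuming it has been saved as a class module named CryptoHash as described. The procedure name DemoCryptoHash and the sample string are arbitrary.

```vba
Public Sub DemoCryptoHash()
    Dim hasher As CryptoHash
    Set hasher = New CryptoHash

    ' Class_Initialize selects the MD5 provider by default
    Debug.Print hasher.Compute("Option Explicit")

    ' Switch providers through the Provider property, then hash the same text again
    hasher.Provider = SHA256
    Debug.Print hasher.Compute("Option Explicit")

    Set hasher = Nothing
End Sub
```

Because Provider is write-only (Property Let without a matching Get), the caller has to keep track of which algorithm produced a stored hash.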

Related

Octave: how to retrieve data from a Java ResultSet object?

I need to feed my Octave instance with data retrieved from an Oracle database.
I have implemented an OJDBC connection in my Octave instance and I am now able to put data from an Oracle database into a Java ResultSet object in Octave (taken from: https://lists.gnu.org/archive/html/help-octave/2011-08/msg00250.html):
javaaddpath('access-path-to-ojdbc8.jar') ;
props = javaObject('java.util.Properties') ;
props.setProperty("user", 'username') ;
props.setProperty("password", 'password') ;
driver = javaObject('oracle.jdbc.OracleDriver') ;
url = 'jdbc:oracle:thin:@ip:port:schema' ;
con = driver.connect(url, props) ;
sql = 'select-query' ;
ps = con.prepareStatement(sql) ;
rs = ps.executeQuery() ;
But haven't succeeded with retrieving data from that ResultSet.
How can I put data from a ResultSet object in Octave into an array or matrix?
Finding out what to do
The docs you want for ResultSet and related classes are in the Java JDBC API documentation. (You don't need the Oracle-specific doco unless you want to do fancy Oracle-specific stuff. All JDBC drivers conform to the generic JDBC API.) Have a look at that and any JDBC tutorial; because it is a Java object, you'll use all the same method calls from Octave that you would from Java code.
For conversion to Octave values, know that Java primitives convert to Octave types automatically, java.lang.String objects require conversion by calling char(...) on them, and java.sql.Date values you will have to convert to datenums manually. (Lazy way is to get their string values and parse them; fast way is to get their Unix time values and convert numerically.)
What to do
Because Java JDBC advances the result set cursor one row at a time, and requires a separate method call to get the value for each column, you need to use a pair of nested loops to iterate over the ResultSet. Like this:
rsMeta = rs.getMetaData();
nCols = rsMeta.getColumnCount();
data = NaN(1, nCols);
iRow = 0;
while rs.next()
iRow = iRow + 1;
for iCol = 1:nCols
data(iRow,iCol) = rs.getDouble(iCol);
endfor
endwhile
Ah, but what if your columns aren't all numerics? Then you'll need to look at the column type in rsMeta, switch on it, and use a cell array to hold the heterogeneous data set. Like this:
rsMeta = rs.getMetaData();
nCols = rsMeta.getColumnCount();
data = cell(1, nCols);
iRow = 0;
while rs.next()
iRow = iRow + 1;
for iCol = 1:nCols
colTypeId = rsMeta.getColumnType(iCol);
switch colTypeId
case NUMERIC_TYPE
data{iRow,iCol} = rs.getDouble(iCol);
case CHAR_TYPE
data{iRow,iCol} = rs.getString(iCol);
data{iRow,iCol} = char(data{iRow,iCol});
# ... and so on ...
otherwise
error('Unsupported SQL data type in column %d: %d', ...
iCol, colTypeId);
endswitch
endfor
endwhile
How do you know what the values for NUMERIC_TYPE, CHAR_TYPE, and so on should be? You have to examine the values in the java.sql.Types Java class. Do that at run time to make sure you're consistent with the JDK you're running against.
(Note: this code is the easy, sloppy way of doing it. There's all sorts of improvements and optimizations you could (and should) do on it.)
How to go fast
Unfortunately, the performance of this is going to suck big time, because Java method calls from Octave are expensive, and cells are an inefficient way of holding data. If your result sets are large, in order to get good performance, what you need to do is write a result set buffering layer in Java that runs the loops in Java and buffers the results in primitive per-column arrays, and use that. If you want an example of how to do this, I have an example implementation in Matlab in my Janklab library (M-code layer here). Feel free to steal the code. Octave doesn't support dot-referencing of Java constructors or class methods, so to convert it to Octave, you'd need to replace all those with javaObject and javaMethod calls. (That's tedious and results in ugly code, so I'm not going to do it myself. Sorry.)
If you're not willing to do that (and really, who is?), and still need good performance, what you should actually do is forget about connecting Octave directly to Oracle, and write a separate Python/NumPy or R program that takes your query, runs it against your Oracle db, and writes the result to a .mat file that you will then read from Octave.
I don't have access to the specified .jar or a suitable database to test your specific code, but in any case, this isn't really an Octave problem. Effectively you need the relevant API for the ResultSet class, and a standard approach for processing it. The Oracle documentation suggests that in Java you'd do something like this:
while (rs.next()) { System.out.println (rs.getString(1)); }
So, presumably this is exactly what you'll do in octave too, except via octave's java interface. One possible way this might look like is
while rs.next().booleanValue % since a Boolean java object by itself
% isn't valid logical input for octave's
% 'while' statement
% do something with rs, e.g. fill in a cell array
endwhile
As for whether you can automatically convert a java array to an octave cell-object or vice-versa, as far as I know this is not possible. You'd have to set / get elements from one to the other via a for loop, just like you'd do in java (e.g. see the note in the manual regarding the javaArray function)

Late Binding Global Variables?

I'm using VBA for Excel. From my understanding, global variables need to be declared outside of any subs. That's the only way they can be accessed by all subs.
At the same time, I want to use late binding for the "Microsoft Scripting Runtime" library (in order to use the Dictionary object type) so that an end user doesn't have to add the reference himself.
My code is as below:
On Error Resume Next
strGUID = "{420B2830-E718-11CF-893D-00A0C9054228}"
ThisWorkbook.VBProject.References.AddFromGuid GUID:=strGUID, Major:=1, Minor:=0
Dim Dic1 As Object
Set Dic1 = CreateObject("Scripting.Dictionary")
Dim Dic2 As Object
Set Dic2 = CreateObject("Scripting.Dictionary")
What if I want to declare global dictionary object with late binding? It looks like VBA won't allow me to put any code outside of the subs (other than the declarations).
How may I declare a global dictionary object without needing the end user to configure the library reference himself? Should I do the following?
Dim Dic1 As Object
Dim Dic2 As Object
Sub Prog1()
On Error Resume Next
strGUID = "{420B2830-E718-11CF-893D-00A0C9054228}"
ThisWorkbook.VBProject.References.AddFromGuid GUID:=strGUID, Major:=1, Minor:=0
Set Dic1 = CreateObject("Scripting.Dictionary")
Set Dic2 = CreateObject("Scripting.Dictionary")
End Sub
Like the VBA code itself, project references don't magically disappear when your user opens your host workbook. They're saved along with the code in the host document.
So, the premise of your question is wrong: users never need to tweak project references.
Also the Scripting Runtime type library is standard issue and has been shipped the exact same version on every single Windows machine built this century (even before that), which means unless your code needs to run on a Mac, there's no need to ever late-bind the Scripting Runtime library.
And if your code needs to run on a Mac, the library won't late-bind anyway because it won't be found on the host machine, so late-binding the Scripting Runtime only serves to invite silly typos and other easily avoidable bugs that IntelliSense helps prevent.
ThisWorkbook.VBProject.References.AddFromGuid GUID:=strGUID, Major:=1, Minor:=0
This defeats the entire purpose of late-binding: it's using the VBIDE extensibility library (which requires lowered macro security settings) to programmatically add a reference that you can easily add at design-time through the VBE's Tools menu.
Late-bound code doesn't need the reference at all. Not at compile-time, not at run-time.
Add the reference, save, then declare your objects As Scripting.Dictionary and enjoy the benefits of early-bound code.
Set Dic1 = New Scripting.Dictionary
That's all you need.
What if I want to declare global dictionary object with late binding? It looks like VBA won't allow me to put any code outside of the subs (other than the declarations).
Late binding isn't any different than early binding in that aspect. The only difference between late and early bound code is the As clause of the declaration:
Private foo As Object ' no compile-time type knowledge: late-bound
Private bar As Dictionary ' compile-time type knowledge: early-bound
How you're initializing that object reference makes no difference to the late/early binding nature of the declaration.
This looks up a ProgID in the registry to find the library and the type:
Set foo = CreateObject("Scripting.Dictionary")
This uses the project references:
Set foo = New Scripting.Dictionary
Both are correct, and both will work against either early or late-bound declarations. Except, if you already have a reference to the type library, there's not really a need to go hit the registry to locate that library - just New it up!
Global variables are really not needed and should be avoided. However, if you have decided to use them for your own reasons, you can initialize them in the Workbook_Open event:
Option Explicit
Dim Dic1 As Object
Dim Dic2 As Object
Private Sub Workbook_Open()
Set Dic1 = CreateObject("Scripting.Dictionary")
Set Dic2 = CreateObject("Scripting.Dictionary")
End Sub
Thus, it would create the object every time the workbook is opened.
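An alternative sketch, in case the dictionaries need to be reachable from every module: variables declared with Dim at the top of ThisWorkbook are private to that module, so the version below puts the storage in a standard module and exposes it through a property that creates the object on first use. The module layout and the names mDic1 and Dic1 are assumptions for illustration only.

```vba
' In a standard module (for example, one named Globals)
Option Explicit

Private mDic1 As Object     ' late-bound Scripting.Dictionary

' Any procedure in the project can write Dic1("key") = value.
' The dictionary is created the first time it is requested.
Public Property Get Dic1() As Object
    If mDic1 Is Nothing Then
        Set mDic1 = CreateObject("Scripting.Dictionary")
    End If
    Set Dic1 = mDic1
End Property
```

This also survives a state reset (for example after an unhandled error and End), because the next access simply recreates the object instead of relying on Workbook_Open having run.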

Releasing Objects in Functions

I have read countless times that you should always release objects at the end of your projects, such as:
Sub Test()
Dim obj As Object
Set obj = GetObject(, "xxxxxx.Application")
' Code your project...
Set obj = Nothing
End Sub
However, I use a program that calls upon Excel's object. I call Excel's object so often in so many different routines, I decided to make a function to make things simpler:
Public Function appXL() As Object
Set appXL = GetObject(, "Excel.Application")
End Function
Although that is an extremely simple function, there is no way for me to release the object appXL at the end of any subroutine I use, at least as far as I am aware. But a bigger concern that I have is that a subroutine that uses appXL 100+ times is actually grabbing the object the same number of times.
I honestly don't see why it would be a big issue since I am using GetObject as opposed to CreateObject. I am not creating a new instance of Excel every time the function is called, so is this something that I should be concerned about when coding? Do I just need to completely get rid of the function and declare appXL in every routine I use so I am able to release it at the end?

An object as both an array and a variable?

I inherited this old TurboBasic code base, and I am converting it to something more modern.
Can you explain how in this code snippet Wind can be both a variable and an array?
Dim Wind(1:3,2:3)
Sub WindFunction
Shared Wind()
local var
Erase Wind
Wind = 123
var = Wind
Wind(1,2) = 567
End Sub
The Wikipedia page on Turbo Basic suggests that it is one of the dialects where
A ... double
A$ ... string
A(...) ... array of double
are treated as totally separate variables, so in your case you have
Wind(...) ... an array of double
Wind ... a double
These dialects treat most variables' types just by their name. Only arrays need to be declared. Sometimes even arrays can be addressed without declaration, they are then assumed to be an array with one dimension and a size of 10.
Some more links can be found here on SO (oh, just saw it's by you, too *g*):
https://stackoverflow.com/questions/4147605/learning-turbobasic

Passing array byref doesn't edit original array

I'm trying to write a subroutine in Access 2003 that removes all quote characters from strings in an array. The subroutine removes the quotes successfully in the routine itself, but not when the program returns to the calling function. I'm very confused, as the array is passed ByRef.
How it is called:
Call removeQuotes(wbs_numbers())
and the subroutine itself:
'goes through a string array and removes quotes from each element in the array'
Sub removeQuotes(ByRef string_array() As String)
For Each element In string_array()
'chr(34) is quotation character. visual basic does not have escape characters.'
element = Replace$(element, Chr(34), "")
Next
End Sub
Can someone please explain what I am doing wrong? I would love you forever!
Your array may be by reference, but element isn't. Iterate by index, and set the string back into the array once you've manipulated it.
You are creating new variable "element" and do not store it back in string_array, so it is not changing.
My VB is a little rusty but a quick google search turned up something like this:
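Sub removeQuotes(ByRef string_array() As String)
    Dim i As Long
    ' Index into the array so each cleaned string is written back to its slot
    For i = LBound(string_array) To UBound(string_array)
        string_array(i) = Replace$(string_array(i), Chr(34), "")
    Next i
End Sub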
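To see the difference for yourself, here is a small self-contained sketch. The names StripQuotesInPlace and DemoStripQuotes are made up for this example, and the helper is just an index-based loop of the kind suggested above.

```vba
Option Explicit

' Index-based version: writes each cleaned string back into the caller's array.
Private Sub StripQuotesInPlace(ByRef items() As String)
    Dim i As Long
    For i = LBound(items) To UBound(items)
        items(i) = Replace$(items(i), Chr(34), "")
    Next i
End Sub

Public Sub DemoStripQuotes()
    Dim names(0 To 1) As String
    Dim element As Variant

    names(0) = Chr(34) & "alpha" & Chr(34)
    names(1) = "be" & Chr(34) & "ta"

    ' For Each hands out a copy of each element, so this changes nothing in names()
    For Each element In names
        element = Replace$(element, Chr(34), "")
    Next element
    Debug.Print names(0), names(1)      ' quotes are still present

    StripQuotesInPlace names
    Debug.Print names(0), names(1)      ' quotes are gone
End Sub
```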