I need a very simple hash-function and based on some experiments with excel just a sum of byte values should do it:
Function HashPart(strVal As String) As Long
' work with byte representation for speed
Dim b() As Byte
b = strVal
Dim result As Long
result = 0
For i = 0 To UBound(b)
result = result + b(i)
Next
Quersumme = result
End Function
This is done many time over all records (about 100) resulting from a query:
Set rs = db.OpenRecordset(strSQL)
' Loop through records
Do While Not rs.EOF
resultHash = resultHash + HashPart(rs(0))
resultLen = resultLen + Len(rs(0))
rs.MoveNext
Loop
rs.Close
MyHash = Str(resultLen) & "-" & Str(resultHash)
This works well enough, but is very slow. My previous version iterating over the String using Mid was even slower, but now I am out of ideas how to improve this.
Is there a way to speed this up?
Edit: the problem wasn't in the hash function but in the query.
Test code with constant strings showed that the function itself is very fast. 10,000 calls with strings of ca. 110 characters take only 0.04 seconds.
Conclusion: the performance problem was in the query, not the hash function.
Function HashPart(strVal As String) As Long
' work with byte representation for speed
Dim b() As Byte
Dim result As Long
Dim i As Long
b = strVal
result = 0
For i = 0 To UBound(b)
result = result + b(i)
Next
HashPart = result
End Function
Sub TestHashPart()
Const NumRounds = 10000
Dim i As Long
Dim res As Long
Dim SumRes As Double ' avoid limitation of Long (2^31)
Dim S As String
Dim t1 As Single
t1 = Timer
For i = 1 To NumRounds
' constant string with tiny variations
S = "abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ " & CStr(i ^ 2)
res = HashPart(S)
' This would slow down the process dramatically. DO NOT activate for NumRounds > 1000 !
' Debug.Print i, res, Len(S), S
SumRes = SumRes + res
Next i
Debug.Print SumRes, Timer - t1 & " seconds"
End Sub
Function HashPart(strVal As String) As Long
' work with byte representation for speed
Dim b() As Byte
b = strVal
For i = 0 To UBound(b)
HashPart = HashPart + b(i)
Next
End Function
There's not much to improve, I think if you don't put the additional variable in there and don't set a number to 0 that defaults to 0 you're very slightly better off.
Related
Newbie here. I have an HTML source code and would like to look for string in between header tags <h1></h1>, <h2></h2>till <h5></h5> and then convert the text to lower case except acronyms or abbreviations (these are all capitals in 2 or more characters). And make sure that all country names in between use proper case.
As an example: It will find <h1>HR Policies and Procedures for Hiring - argentina LTD</h1>
It will convert it to:<H1>HR policies and procedures for hiring - Argentina LTD</H1>
I've tried a user defined function for Excel VBA found online: CapIt(A2). It uses Search, Split and Join. I'm not able to put them together to come up with the result. Would appreciate very much your help. Thank you.
Code I saw online as initial reference:
Function Capit(s As String)
Dim v As Variant, j As Long
v = Split(s, " ") ' separates the words
For j = LBound(v) To UBound(v)
If StrComp(v(j), UCase(v(j)), vbBinaryCompare) <> 0 Then v(j) = StrConv(v(j), vbProperCase)
Next j
Capit = Join(v, " ") ' joins the words
End Function
'Added this code below, can we use the results to lowercase the string and exclude the output in this function
Function FindAcronyms(yourWord As String)
Dim I As Integer
Dim ctr As Integer
FindAcronyms = Null
For I = 1 To Len(yourWord)
If Asc(Mid(yourWord, I, 1)) <= 90 And _
Asc(Mid(yourWord, I, 1)) >= 65 Then
If ctr > 0 Then
FindAcronyms = FindAcronyms & Mid(yourWord, I - 1, 1)
End If
ctr = ctr + 1
Else
If ctr > 1 Then
FindAcronyms = FindAcronyms & Mid(yourWord, I - 1, 1) & ", "
End If
ctr = 0
End If
Next
If ctr > 1 Then
FindAcronyms = FindAcronyms & Mid(yourWord, I - 1, 1)
End If
If Right(FindAcronyms, 2) = ", " Then
FindAcronyms = Left(FindAcronyms, Len(FindAcronyms) - 2)
End If
End Function
'the final look would be something like this
Sub TitleChange()
'define array
myarray = Range("A1:A100")
' Define the pattern
Dim pattern As String: pattern = "<h*>*</h*>" 'looks for the header tags
Dim f As Variant
For Each f In myarray
If f Like pattern = True Then Capital (f) 'changes all string to lower case except countries (to retain proper case) and acronyms (to retain uppercase)
Next f
End Sub
You can include the countries in an array
Sub Test()
Debug.Print Capital("HR Policies and Procedures for Hiring - argentina LTD")
End Sub
Function Capital(ByVal s As String)
Dim a, v As Variant, j As Long
a = Array("Argentina", "Egypt", "Enland")
v = Split(s, " ")
For j = LBound(v) To UBound(v)
If StrComp(v(j), UCase(v(j)), vbBinaryCompare) <> 0 Then v(j) = StrConv(v(j), vbLowerCase)
If Not IsError(Application.Match(v(j), a, 0)) Then v(j) = StrConv(v(j), vbProperCase)
Next j
Capital = Join(v, " ")
End Function
Added UDF that parses HTML code, used the Sub Test above as UDF Capital and UDF to bring together. Welcome suggestions to make it cleaner or more efficient
Dim rng As Range, cell As Range
Set rng = Range("A1:A5")
' Define the pattern
Dim pattern As String: pattern = "*<h?>*</h?>*"
' Check each item against the pattern
For Each cell In rng
If (cell Like pattern = True) Then
cell.Offset(0, 16).Value = cell.Value
cell.Offset(0, 16).Value = joinCell(Capital(StripHTML(cell)), cell.Offset(0, 0).Value) 'used UDF for striping innertext, applying rules and joining back string
End If
Next cell
End Sub
I have Access data I'm trying to export to a Word table. The table has 3 columns, the first row and first column are all headers.
I'm trying to loop through the recordset and populate columns 2 & 3 with data. I'm able to start at row 2 and populate columns 2 and 3, but I cannot figure out how to move to the next row.
iTbl = 1
irow = 2
iCol = 1
Do Until recSet2.EOF
If irow > wDoc.Tables(iTbl).Rows.Count Then
wDoc.Tables(iTbl).Rows.Add
End If
For Each fld In recSet2.Fields
On Error Resume Next
iCol = iCol + 1
wDoc.Tables(iTbl).Cell(irow, iCol).Range.Text = Nz(fld.Value)
Next fld
recSet2.MoveNext
irow = irow + 1
iCol = 1
Loop
The best way to create a table in Word, especially one with a lot of data, is to first write the data into a character-delimited string format. Assign the string to a Range in Word, then use the ConvertToTable method to turn it into a table. That will save a lot of trouble with manipulating the object model and is the most efficient approach (fastest in execution).
The following code demonstrates this principle. The procedure Test creates a new instance of Word, creates a new document in the Word application then assigns the character-delimited string to the document content. This is then turned into a table. If you need to format that table, use the tbl object to do so. The way this code is written requires a reference to the Word object library (early binding). Note that it's also possible to use late-binding - you'll find loads of examples for that.
The second procedure, concatData is called in Test to create the character delimited string. It uses a Tab character as the field separator and a carriage return as the record separator. Word will accept pretty much anything as the field separator; the record separator must be a carriage return (ANSI 13).
Sub Test()
Dim wd As Word.Application
Dim doc As Word.Document
Dim rng As Word.Range
Dim tbl As Word.Table
Set wd = New Word.Application
wd.Visible = True
Set doc = wd.Documents.Add
Set rng = doc.Content
rng.Text = concatData()
Set tbl = rng.ConvertToTable
End Sub
Public Function concatData() As String
Dim retVal As String
Dim rsHeader As Long, rsCounter As Long
Dim rs As Recordset
Set rs = CurrentDb.OpenRecordset("nameOfRecordset", dbOpenDynaset)
'Get headers
For rsHeader = 0 To rs.Fields.Count - 1
retVal = retVal & rs.Fields(rsHeader).Name & vbTab
Next
'Replace last TAb with a carriage return
retVal = Left(retVal, Len(retVal) - 1) & vbCr
Do While Not rs.EOF
'Get all records
For rsCounter = 0 To rs.Fields.Count - 1
retVal = retVal & rs.Fields(rsCounter).Value & vbTab
Next
retVal = Left(retVal, Len(retVal) - 1) & vbCr
rs.MoveNext
Loop
concatData = retVal
End Function
Thanks for all the help guys. I managed to figure it out and works very well. It wouldn't move down to the next row and was attempting to write data to column(4) which doesn't exist, then throwing an error. Here is the code I used:
iTbl = 1
iRow = 2
iCol = 1
For Each fld In recSet2.Fields
iCol = iCol + 1
If iCol < 4 Then
wDoc.Tables(iTbl).Cell(iRow, iCol).Range.Text = Nz(fld.value)
Else
If iCol > 3 Then
iCol = iCol - 2
iRow = iRow + 1
wDoc.Tables(iTbl).Cell(iRow, iCol).Range.Text = Nz(fld.value)
End If
End If
Next fld
I developed some code for an Access Database that manipulates a string with a statement like:
myString = Left(myString, somePosition) & Right(myString, someOtherPosition)
the above is part of a loop that has thousands of iterations and the variable myString is thousand of characters long.
I know the above code is bad practice in Java and a StringBuffer should be used instead of a string.
My code is taking a lot of time to run (about 7 minutes) and I suspect the problem might be related to the heavy string manipulation that is going on. Can you please confirm if there is anything similar to StringBuffer in VBA that could improve the efficiency of my code?
Update: full code with StringBuilder
Function SelectColumns2(str As String, columns As String, separator As String) As String
'column_number is the number of the column we are reading when we loop through a line
'z is the counter of the field (a portion of str between two separators)
'i is the counter of the str (the position of the modified string)
Dim column_number As Integer, i As Double, z As Integer, leftPosition As Double
'stringbuilder that stores the string that will represent the final file
Dim sb As StringBuilder, leftStr As StringBuilder, rightStr As StringBuilder
Set sb = New StringBuilder
Set leftStr = New StringBuilder
Set rightStr = New StringBuilder
sb.Append str
column_number = 1
i = 1 ' full str
z = 0 ' full field
While sb.Length >= i
z = z + 1
If Mid(sb.Text, i, 1) = separator Then
If InStr(1, columns, "/" & column_number & "/") = 0 Then
leftStr.Append left(sb.Text, i - z)
rightStr.Append right(sb.Text, sb.Length - i)
sb.Clear
sb.Append leftStr.Text
sb.Append rightStr.Text
leftStr.Clear
rightStr.Clear
i = i - z
End If
column_number = column_number + 1
z = 0
ElseIf Mid(sb.Text, i, 1) = Chr(10) Then
If InStr(1, columns, "/" & column_number & "/") = 0 Then
leftPosition = max((i - z - 1), 0)
If leftPosition = 0 Then
leftStr.Append left(sb.Text, leftPosition)
rightStr.Append right(sb.Text, sb.Length - i)
sb.Clear
sb.Append leftStr.Text
sb.Append rightStr.Text
Else
leftStr.Append left(sb.Text, leftPosition)
rightStr.Append right(sb.Text, sb.Length - i + 1)
sb.Clear
sb.Append leftStr.Text
sb.Append rightStr.Text
End If
leftStr.Clear
rightStr.Clear
i = i - z
End If
column_number = 1
z = 0
End If
i = i + 1
Wend
SelectColumns2 = left(sb.Text, sb.Length - 1)
End Function
You can use CreateObject to create the .Net stringbuilder class. Note that you will have to have the relevant .Net library installed, and VBA does not support overloading, so it will handle a little differently than in VB.Net.
Sample code:
Public Sub TestSB()
Dim sb As Object
Set sb = CreateObject("System.Text.StringBuilder")
sb.Append_3 "Hello"
sb.Append_3 " "
sb.Append_3 "World"
sb.Append_3 "!"
Debug.Print sb.ToString
End Sub
Alternatively, you can build your own stringbuilder. This answer provides a stringbuilder class, and this question also shows some sample code.
You can - for an extremely simple implementation - use Mid.
For example, this code runs in about 0.1 ms for the quite large strings entered:
Public Function ChopString() As String
Dim Source As String
Dim LeftPart As Long
Dim RightPart As Long
Dim Result As String
Source = String(100000, "x")
LeftPart = 30000
RightPart = 40000
Result = Space(LeftPart + RightPart)
Mid(Result, 1) = Left(Source, LeftPart)
Mid(Result, 1 + LeftPart) = Right(Source, RightPart)
ChopString = Result
End Function
For smaller strings of a few K, it runs way faster.
Doing data analysis for a lab and I have a table of samples that failed and all the criteria they could've failed on. Trying to add a field with a string listing which criteria each sample failed on.
I just learned VBA 2 weeks ago so I don't really know what I'm doing. I used recordset to turn my table into an array, then looped through each record to see if each criteria has failed and add it to a new failure array if it has. Then I print the failure array in an ugly concatenated string. There are less than 100 records but it's still very slow and sometimes crashes Access. Here's my code:
Option Compare Database
Option Explicit
Dim arrFails() As Variant
Dim db As DAO.Database
Dim rs As DAO.Recordset
Dim HType As Integer
Dim S As Integer
Public Sub MakeArrs()
On Error GoTo ErrorHandler
Set db = CurrentDb
'Set rs = db.OpenRecordset("S" & HType & "RptSimple")
Set rs = db.OpenRecordset("S31RptSimple")
rs.MoveLast
rs.MoveFirst
S = rs.RecordCount - 1
Debug.Print S
Dim arrRpt() As Variant
arrRpt = rs.GetRows(S + 1)
Debug.Print arrRpt(0, 0)
'This line creates an array arrFails with sample runs as rows, and 9 columns. Each column is a failure criteria.
ReDim arrFails(0 To S, 0 To 8) As Variant
Dim i As Long
Let i = 0
Dim index As Long
'This For loop starts at the first record in arrRpt and goes across the row with an If loop for each of the failure criteria.
'If the sample failed for that criteria, it populates the new arrFails array with the name of the criteria.
'If the sample passed, that spot on the array stays null.
'At the end of one loop, we have a row that ONLY has values for the criteria that failed.
For index = 0 To S
If arrRpt(2, i) < 0.85 Or IsNull(arrRpt(2, i)) = True Then
arrFails(i, 0) = "Correlation, "
End If
If arrRpt(3, i) > -0.4 Or arrRpt(3, i) < -2 Or IsNull(arrRpt(3, i)) = True Then
arrFails(i, 1) = "Slope, "
End If
If arrRpt(4, i) < 0.5 Or arrRpt(4, i) > 100 Or IsNull(arrRpt(4, i)) = True Then
arrFails(i, 2) = "Slope_Ratio, "
End If
If arrRpt(5, i) < 2 Or IsNull(arrRpt(5, i)) = True Then
arrFails(i, 3) = "Valid_Points, "
End If
If IsNull(arrRpt(6, i)) = False Then
arrFails(i, 4) = "Fail_Code, "
End If
If arrRpt(7, i) < 1.5 Or arrRpt(7, i) > 10 Or IsNull(arrRpt(7, i)) = True Then
arrFails(i, 5) = "DilutionRatio1, "
End If
If arrRpt(8, i) < 1.5 Or arrRpt(8, i) > 10 Or IsNull(arrRpt(8, i)) = True Then
arrFails(i, 6) = "DilutionRatio2, "
End If
arrFails(i, 8) = arrRpt(0, i)
i = i + 1
Next
rs.Close
'This is error handling code, so if something goes wrong it'll gracefully exit the code instead of getting some poor user stuck in debug hell.
ExitSub:
Exit Sub
ErrorHandler:
MsgBox "There's been an error."
Resume ExitSub
Set rs = Nothing
Set db = Nothing
End Sub
Public Function FailList2(HPVType, UIDFieldname)
HType = HPVType
Call MakeArrs
Dim x As Variant
x = 0
Do While x < S + 1
If UIDFieldname = arrFails(x, 8) Then
FailList2 = arrFails(x, 1) & arrFails(x, 0) & arrFails(x, 2) & arrFails(x, 3) & arrFails(x, 4) & arrFails(x, 5) & arrFails(x, 6)
Exit Do
End If
x = x + 1
Loop
End Function
Help a newbie out? There must be a more efficient way to do this. I tried turning echo off until the end of the FailList2 function but it didn't help. Note that I need to keep 'Htype' in the function. I'm just running this on one table right now, but when I fix it I have 8 more tables to run it on, hence the rs code I commented out at the beginning.
I have that function as a field in a query
Uh-oh. There is the problem. Just open the Immediate window, open your query and watch the Debug.Print statements roll in. The function will be executed over and over again.
You need to execute the function once, write the results not into an array (arrFails), but into a table instead. Use Recordset.AddNew to add records.
Then use that table as input for your query.
Ok so i have a complex reason field from one of our logging servers, and i need to break it down to make some sense, problem is the format changes depending on the status.
I managed to find some strings that i can compare the the reason to to get some sense out of it, but I want to distill it down to one reason code.
I scratched my head a bit and got it down to 7 reasons with different criterion, put the criteria in a table and came up with some vb code to do the comparison.
Problem is its dead slow, and half the reporting relies on the Reason code. The basic VBA function is below, This basically loads the criteria into an array and then compares the value against the array to return the ID.
Function Reason_code(LongReason As String) As Integer
Dim NoReason As Integer
Dim I As Integer
Dim J As Integer
Dim x As Boolean
NoReason = recordCount("RejReason") - 1
Dim conExpr() As String
ReDim conExpr(NoReason)
For I = 0 To (NoReason - 1)
conExpr(I) = GetVal("Criterior", "RejReason", "id", CStr(I + 1))
Next I
For J = 0 To (NoReason - 1)
x = LongReason Like conExpr(J)
If x = True Then
GoTo OutOfLoop
End If
Next J
OutOfLoop:
Reason_code = J + 1
End Function
I have used similar in VB before and it tends to be quite fast, so i am reconing that my GetVal function is the problem, but my VBA is rusty and my SQL is pretty non existent, so any help would be appreciated. I tried LSQL and SQL2 as one line but VBA doesnt like it.
Function GetVal(FieldNm As String, TableNm As String, IndexField As String, IndexNo As String) As String
Dim db As Database
Dim Lrs As DAO.Recordset
Dim LSQL As String
Dim LGST As String
Dim SQL2 As String
'Open connection to current Access database
Set db = CurrentDb()
'Create SQL statement to retrieve value from GST table
LSQL = CStr("SELECT " + FieldNm + " FROM " + TableNm)
SQL2 = CStr(LSQL + " WHERE " + IndexField + " = " + IndexNo)
Set Lrs = db.OpenRecordset(SQL2, dbOpenDynaset, dbReadOnly)
'Retrieve value if data is found
If Lrs.EOF = False Then
LGST = Lrs(0)
Else
LGST = "Item Not found"
End If
Lrs.Close
Set Lrs = Nothing
GetVal = LGST
End Function
Thanks in advance,
I Scratched my head for a bit and worked out i could speed it up by doing the read and compare at the same time, its not lightning, but its better
Function ReasonCode(LongReason As String) As String
Dim cdb As Database
Dim rs As DAO.Recordset
Dim RejRea()
Dim NoReason As Integer
Dim result As Boolean
Dim i As Integer
Set cdb = CurrentDb()
Set rs = cdb.OpenRecordset("RejReason", dbOpenDynaset, dbReadOnly)
rs.MoveLast
rs.MoveFirst
NoReason = rs.recordCount - 1
RejRea() = rs.GetRows(rs.recordCount)
For i = 0 To NoReason
result = LongReason Like CStr(RejRea(2, i))
If result = True Then
ReasonCode = CStr(RejRea(1, i))
GoTo outloop
End If
Next i
If ReasonCode = "" Then ReasonCode = "Not Found"
outloop:
Set rs = Nothing
Set cdb = Nothing
End Function
Still not sure its the best way to do it, but in the abscence of any other suggestions it will do for now.