Extract data from website in a loop (changing date) to Sheet - html

First off, I'm completely new to VBA and want to build a weather statistic in Excel. Help is much appreciated!
For this I need data from multiple websites which include the information of the messured temperature, the weather station and the time.
With some VBA information I found on the internet I have so far written a code, which delivers me the needed information for one website in the immediate window only.
Which looks like this:
7.4°C | Wien-Mariabrunn (225m) | 14:00
7.6°C | Wien-Hohe Warte (198m) | 14:00
7.6°C | Wien-Unterlaa (200m) | 14:00
7.7°C | Wien-Schwechat (183m) | 14:00
7.8°C | Wien-Donaufeld (160m) | 14:00
8.1°C | Grossenzersdorf (154m) | 14:00
8.2°C | Wien-City (177m) | 14:00
Dim xmlReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim Temps1 As MSHTML.IHTMLElementCollection
Dim temps2 As MSHTML.IHTMLElementCollection
Dim Temp As MSHTML.IHTMLElement
xmlReq.Open "GET", "https://kachelmannwetter.com/at/messwerte/wien/temperatur/20190101-1300z.html", False
xmlReq.send
If xmlReq.Status <> 200 Then
MsgBox "Problem" & vbNewLine & xmlReq.Status & " - " & xmlReq.statusText
Exit Sub
End If
HTMLDoc.body.innerHTML = xmlReq.responseText
Set Temps1 = HTMLDoc.getElementsByClassName("ap o o-1 o-tmp-5")
Set temps2 = HTMLDoc.getElementsByClassName("ap o o-1 o-tmp-1")
For Each Temp In Temps1
Debug.Print Temp.Title
Next Temp
For Each Temp In temps2
Debug.Print Temp.Title
Next Temp
First problem I have, is that I have no idea how to put this information into a sheet or cells.
The second problem is, that I need the same data from multiple websites from a startdate to enddate.
The website I used in this example-code is https://kachelmannwetter.com/at/messwerte/wien/temperatur/20190101-1300z.html. At the end you can find the date, in this example "20190101" & the time "1300".
So for this I need some kind of loop.
So at the end I need a worksheet with column A with the date, column B with the time and column C with the data (for each date & time).
I hope this was somehow understandable and I am very thankful for any help.

The following works nicely for short date ranges. Longer date ranges lead to slower response. It is likely website is blocking/throttling. To this end:
I include a variable pauseIndex which means that every x (=pauseIndex) number of urls, a delay of y seconds (as specified by waitSeconds) before the next request is added. You can play with this.
You might consider adapting code to run in daily/monthly batches and either append to bottom of existing dataset or write to a new sheet using the current day/month as title.
Perhaps rotate/change IPs and use MSXML2.ServerXMLHTTP.
Tinker with the above to get the optimal settings and batch request size (if doing batch).
Specifying start datetime and end datetimes:
The date ranges are specified in a sheet called Date ranges. It has the following set-up:
Building a dataset:
I would recommend building a flat table dataset where you specify the exact stations to return info for. Not each station appears for every timedate.
stations = Array("Wien-Schwechat", "Wien-Unterlaa", "Wien-Mariabrunn", "Wien-Hohe Warte", "Grossenzersdorf", _
"Wien-Donaufeld", "Wien-City")
You can expand this. I include a dictionary variable newStations which stores all stations encountered that are not in your list for monitoring. You can easily write these out to help decide on additional stations to monitor/include in your dataset.
Placeholder values for missing station readings are used to ensure a complete dataset.
You may wish to normalise “outliers” - for example, actual hour values can lie within the range rather than on the hour. In the demo below 16:20 is retrieved accurately for one station. You could normalise this to 16:00.
Helper functions/Sub:
There are a number of helper functions, and 1 sub, used in the code.
GetAllLinks. Generates all request urls between start date time and end date time. See notes within code. These can be looped to issue each request for data
EmptyDict - ensure that station data is cleared out between requests
UpdateDictForNoReading. Handles the case where a monitored station is not reported for specified datetime. It updates temperature and long station description with "No reading"
WriteOutResults. Produces a "flat" i.e. not nested, 2D array structure and writes results to specified output sheet
Retrieving stations and station data:
I use a css attribute = value selector, with contains operator, to target the station data.
Taking a example station's HTML
<a class="ap o o-1 o-tmp--1" data-target="#obs-detail-3h" data-toggle="modal" data-left="635" data-top="545" onclick="obs_detail_3h('-1.0°C', 'Wien-Schwechat (183m)', '16:20','110360', '201901031500');" title="-1.0°C | Wien-Schwechat (183m) | 16:20" style="left: 408.533px; top: 337.757px;">-1</a>
If we look at the class attribute we see that is it as follows:
class="ap o o-1 o-tmp--1"
The value of the class attribute is "ap o o-1 o-tmp--1", which is in fact a series of classes separated by spaces. Each station class value has the same substring which is o-tmp. You could vary this slightly. I use querySelectorAll to return a nodeList of all elements which have this substring in the class attribute value.
Set mapStations = html.querySelectorAll("[class*='o-tmp']")
This matches all stations on the page (map).
The title attribute of each node in the nodeList (mapStations) contains the data of interest:
title="-1.0°C | Wien-Schwechat (183m) | 16:20"
The string contains pipe (|) delimiters. I can use split() to generate an array containing each bit of info:
arr = Split(mapStations.item(i).Title, " | ")
That generates an array which will have -1.0°C , Wien-Schwechat (183m) and 16:20 at different indices. Using this example, I store -1.0°C in variable temp, Wien-Schwechat (183m) in variable stationFull, just the station name Wien-Schwechatin station, 16:20 in time.
TODO:
Refactor to reduce level of nesting
Error handling for cases such as status code <> 200.....
Move variable declarations closer to their usage
Requirements:
VBE > Tools > References > Add reference to Microsoft HTML Object Library
Worksheet called Date ranges
Worksheet called Output
Data in Date ranges should be laid out as shown in image above.
VBA:
Option Explicit
Public Sub GetInfo()
'VBE > Tools > References > Microsoft HTML Object Library
'Collect hourly temperature readings from list of stations in array stations.
'Missing readings are populated with "Missing reading". Times specified in request are not necessarily identical _
'to that correctly returned from page as reading reported within an hour interval may not be on the hour
Dim html As HTMLDocument, i As Long, arr() As String, mapStations As Object, dict As Object, newStations As Object
Dim time As String, station As String, temp As String, stations(), results(), j As Long
Dim urls As Object, url As Variant, startOfDateString As Long, currDate As String, stationFull As String
Dim outputSht As Worksheet, x As Long
Const pauseIndex As Long = 20
Const waitSeconds As Long = 1
Const PREFIX As String = "https://kachelmannwetter.com/at/messwerte/wien/temperatur/"
Const SUFFIX As String = "z.html"
startOfDateString = InStrRev(PREFIX, "/") + 1
Set outputSht = ThisWorkbook.Worksheets("Output")
Set urls = GetAllLinks(PREFIX, SUFFIX)
Set html = New HTMLDocument
Set dict = CreateObject("Scripting.Dictionary")
Set newStations = CreateObject("Scripting.Dictionary")
stations = Array("Wien-Schwechat", "Wien-Unterlaa", "Wien-Mariabrunn", "Wien-Hohe Warte", "Grossenzersdorf", _
"Wien-Donaufeld", "Wien-City") 'order of stations here should match that in sheet
j = 1
For i = LBound(stations) To UBound(stations)
dict(stations(i)) = vbNullString
Next
ReDim results(1 To 1 * urls.Count)
With CreateObject("MSXML2.XMLHTTP")
For Each url In urls
x = x + 1
If x Mod pauseIndex = 0 Then Application.Wait Now + TimeSerial(0, 0, waitSeconds)
DoEvents
.Open "GET", url, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
Set mapStations = html.querySelectorAll("[class*='o-tmp']")
For i = 0 To mapStations.Length - 1
arr = Split(mapStations.item(i).Title, " | ")
currDate = Join(Array(Mid$(url, startOfDateString + 4, 2), Mid$(url, startOfDateString + 6, 2), Mid$(url, startOfDateString, 4)), "-")
temp = arr(0)
station = Split(arr(1), " (")(0)
stationFull = arr(1)
time = arr(2)
If Not dict.Exists(station) Then
newStations(station) = vbNullString 'in case you are interested in which stations exist you are not monitoring
Else
dict(station) = Array(currDate, time, station, stationFull, temp)
End If
Next
Set dict = UpdateDictForNoReading(dict, currDate, time)
results(j) = dict.items
j = j + 1
Set dict = EmptyDict(dict)
Next
End With
WriteOutResults outputSht, results, UBound(stations) + 1
End Sub
Public Function UpdateDictForNoReading(ByVal dict As Object, ByVal currDate As String, ByVal time As String) As Object
'Loop dictionary containing station readings. If current value for key is not an array then no readings where found. _
'then dict is updated with "No reading" for station long text (which includes m e.g. Wien-Schwechat (183m)) and temperature
Dim key As Variant
For Each key In dict
If Not IsArray(dict(key)) Then dict(key) = Array(currDate, time, key, "No reading", "No reading")
Next
Set UpdateDictForNoReading = dict
End Function
Public Sub WriteOutResults(ByVal ws As Worksheet, ByRef results As Variant, ByVal stationCount As Long)
'Loop results array which at each index should have a child array which is comprised of all stations specified _
'The code unravels the nested structure into "flat" array for writing out to sheet. Aim is to be more efficient _
'with writing out to sheet
'The sheet to write results to is passed as argument ws. Headers are stated below.
Dim headers(), outputArr(), i As Long, arr(), j As Long, r As Long, c As Long
headers = Array("Date", "Time", "Station", "StationFull", "Temp")
ReDim outputArr(1 To UBound(results) * stationCount, 1 To UBound(headers) + 1)
For i = LBound(results) To UBound(results)
arr = results(i) '0-6
For j = LBound(arr) To UBound(arr)
r = r + 1
If IsArray(arr(j)) Then
For c = LBound(arr(j)) To UBound(arr(j))
outputArr(r, c + 1) = arr(j)(c)
Next
End If
Next
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(outputArr, 1), UBound(outputArr, 2)) = outputArr
End With
End Sub
Public Function EmptyDict(ByVal dict As Object) As Object
'Ensures no data remains from prior request
Dim key As Variant
For Each key In dict
dict(key) = vbNullString
Next
Set EmptyDict = dict
End Function
Public Function GetAllLinks(ByVal PREFIX As String, ByVal SUFFIX As String) As Object
'Generate all urls between start date time and end date time. Accounts for fact that there is no real midnight. 00:00 uhr retrieves the 23:00 of prior day.
'Times selected on page e.g. 1:00 uhr are one hour ahead of what is used in url string e.g. 1 > 0.
Dim ws As Worksheet, hours(), urls As Collection
Set urls = New Collection
Set ws = ThisWorkbook.Worksheets("Date ranges")
'url "0000" = 1am. The selection of 00:00 in the sheet gives 23:00 of prior date
hours = Array("0000", "0100", "0200", "0300", "0400", "0500", "0600", "0700", "0800", "0900", "1000", "1100", "1200", _
"1300", "1400", "1500", "1600", "1700", "1800", "1900", "2000", "2100", "2200", "2300")
Dim startDate As Date, endDate As Date, startTime As String, endTime As String, currentDate As Date
Dim endIndex As Long, startIndex As Long
With ws
startDate = .Cells(1, 2).Value2 'Required for input yyyy-mm-dd; Required for output yyyymmdd
endDate = .Cells(1, 5).Value2
startTime = .Cells(2, 2)
endTime = .Cells(2, 5)
End With
startIndex = Application.Match(startTime, hours) - 2
endIndex = Application.Match(endTime, hours) - 2
currentDate = startDate
Dim i As Long, s As Long, e As Long
Do While currentDate <= endDate
If startDate = endDate Then
s = startIndex
e = endIndex
Else
Select Case currentDate
Case startDate
s = startIndex
e = UBound(hours)
Case endDate
s = LBound(hours)
e = endIndex
Case Else
s = LBound(hours)
e = UBound(hours)
End Select
End If
For i = s To e
urls.Add PREFIX & Format$(currentDate, "yyyymmdd") & "-" & hours(i) & SUFFIX
Next
currentDate = DateAdd("d", 1, currentDate)
Loop
Set GetAllLinks = urls
End Function
Example output:

Related

What Discrete Optimization family is this?

I am given N lists of M items that will be physically realized (someone actually has to put items (names abbreviated here,) in physical bins.) Then the bins are emptied, if necessary, and re-used, working left-to-right. There is a real cost to putting a different item in a bin than what was in it before. I rearrange the lists to minimize changes, manually. Software can do it faster, and more reliably in an optimum way. The whole thing happens in Excel (then paper, then in a factory.) I wrote some VBA, a brute-force affair, that did really well with some examples. But not all. If I knew the family of optimization that this is, I could code it, even if I just pass something to a DLL. But multiple searches online have not succeeded. I tried several phrasings. It's not a traveling S.., knapsack, etc. It seems similar to the Sequence Alignment problem from Bioinformatics. Someone recognize it? Let's hear it, Operations Research people.
As it turns out, the naive solution just needed tweaking. Look at a cell. Try to find the same letter in the column to it's right. If you find one, swap it with whatever it to the right of that cell now. Work your way down. The ColumnsPer parameter accounts for the real-world use, where each column has an associated list of numbers and the grid columns alternate labels, numbers, labels, ...
Option Explicit
Public Const Row1 As Long = 4
Public Const ColumnsPer As Long = 1 '2, when RM, %
Public Const BinCount As Long = 6
Public Const ColCount As Long = 6
Private Sub reorder_items_max_left_to_right_repeats(wksht As Worksheet, _
col1 As Long, maxBins As Long, maxRecipes As Long, ByVal direction As Integer)
Dim here As Range
Set here = wksht.Cells(Row1, col1)
here.Activate
Dim cond
For cond = 1 To maxRecipes - 1
Do While WithinTheBox(here, col1, direction)
If Not Adjacent(here, ColumnsPer).Value = here.Value Then
Dim there As Range
Set there = Matching_R_ange(here, direction)
If Not there Is Nothing Then swapThem Adjacent(here, ColumnsPer), there
End If
NextItemDown:
Set here = here.Offset(direction, 0)
here.Activate
'Debug.Assert here.Address <> "$AZ$6"
DoEvents
Loop
NextCond:
Select Case direction
Case 1
Set here = Cells(Row1, here.Column + ColumnsPer)
Case -1
Set here = Cells(Row1 + maxBins - 1, here.Column + ColumnsPer)
End Select
here.Activate
Next cond
End Sub
Function Adjacent(fromHereOnLeft As Range, colsRight As Long) As Range
Set Adjacent = fromHereOnLeft.Offset(0, colsRight)
End Function
Function Matching_R_ange(fromHereOnLeft As Range, _
ByVal direction As Integer) As Range
Dim rowStart As Long
rowStart = Row1
Dim colLook As Long
colLook = fromHereOnLeft.Offset(0, ColumnsPer).Column
Dim c As Range
Set c = Cells(rowStart, colLook)
Dim col1 As Long
col1 = c.Column
Do While WithinTheBox(c, col1, direction)
Debug.Print "C " & c.Address
If c.Value = fromHereOnLeft.Value _
And c.Row <> fromHereOnLeft.Row Then
Set Matching_R_ange = c
Exit Function
Else
Set c = c.Offset(1 * direction, 0)
End If
DoEvents
Loop
'returning NOTHING is expected, often
End Function
Function WithinTheBox(ByVal c As Range, ByVal col1 As Long, ByVal direction As Integer)
Select Case direction
Case 1
WithinTheBox = c.Row <= Row1 + BinCount - 1 And c.Row >= Row1
Case -1
WithinTheBox = c.Row <= Row1 + BinCount - 1 And c.Row > Row1
End Select
WithinTheBox = WithinTheBox And _
c.Column >= col1 And c.Column < col1 + ColCount - 1
End Function
Private Sub swapThem(range10 As Range, range20 As Range)
'Unlike with SUB 'Matching_R_ange', we have to swap the %s as well as the items
'So set temporary range vars to hold %s, to avoid confusion due to referencing items/r_anges
If ColumnsPer = 2 Then
Dim range11 As Range
Set range11 = range10.Offset(0, 1)
Dim range21 As Range
Set range21 = range20.Offset(0, 1)
'sit on them for now
End If
Dim Stak As Object
Set Stak = CreateObject("System.Collections.Stack")
Stak.push (range10.Value) 'A
Stak.push (range20.Value) 'BA
range10.Value = Stak.pop 'A
range20.Value = Stak.pop '_ Stak is empty now, can re-use
If ColumnsPer = 2 Then
Stak.push (range11.Value)
Stak.push (range21.Value)
range11.Value = Stak.pop
range21.Value = Stak.pop
End If
End Sub

ACCESS sql code to multiply record with calculated value from previous cell

I am using Access Database to get a value. I am fairly new to access as I usually use SQLServer and I am having trouble in getting what I want.
I have the following table, with column TARGET and incremental Target as the target column that I need to get:
Category|Period|Value| TARGET |
A | 4 | 1 | 1/1 =1 |
A | 3 | 3 | 1/(3*1)=0.33 | (1/value at period 3 * previous target)
A | 2 | 6 |1/(0.33*6)=0.505|
A | 1 | 9 |1/(0.505*9)=0.22|
The data is partitioned by Category and ordered in descending order by Period.
For the first row the Target should be: (1/value at current period)
For the next rows the Target should be: (1/value at current period * value of previous target)
As you can see this is somehow complex as I need to evaluate a cell value and then for the next row I need to use the value in the cell above.
Plus I need to get the incremental value for this column as well.
Any help will be very much appreciated as I am new to Access and need to get this done soon!
Here is a function placed in general module that can be called from query. Value is a reserved word so I used Data for that field name.
Option Compare Database
Option Explicit
Global dblTar As Double
Global strCat As String
____________
Function CalcTarget(strC As String, dblT As Double) As Double
If strCat <> strC Then
strCat = strC
dblTar = dblT
End If
dblTar = 1 / (dblT * dblTar)
CalcTarget = dblTar
End Function
Calling function in query:
SELECT Category, Period, Data, CalcTarget([Category],[Data]) AS Target FROM Table1;
Normally I advise not to save calculated data to table when a query can work, but if you prefer to save, then options are:
An UPDATE action: UPDATE Table1 SET Target = CalcTarget([Category],[Data]);
Or VBA:
Sub CalcTarget()
Dim rs As DAO.Recordset
Dim strCat As String
Dim dblTar As Double
Set rs = CurrentDb.OpenRecordset("SELECT * FROM table1 ORDER BY Category, Period DESC")
strCat = rs!Category
dblTar = rs!Data
Do While Not rs.EOF
If rs!Category = strCat Then
dblTar = 1 / (rs!Data * dblTar)
rs.Edit
rs!Target = dblTar
rs.Update
rs.MoveNext
Else
strCat = rs!Category
dblTar = rs!Data
End If
Loop
End Sub

vb.net + mysql - Search table for top 5 rows that are the most similar to input values

I have a Database with many columns, one of them containing Names.
My vb.net software acts as telegram server and waits for the user to send its full name.
The database could have its name spelled differently, for example "Marco Dell'Orso" could be spelled "Marco Dellorso" or "Marco Dell Orso" od "Dell Orso Marco" or whatever. The user could also misspell his name and invert two letters.. for esample "MaCRo Dell'Orso"
I would need a way to return the 5 rows that are the most similar to the words used in the query. What would be the best way? I was thinking of splitting the name on whitechars and then use LIKE in the query with the single words, but that does not work with mistyped words.
EDIT:
My current plan is to that if the database contains more than one or less then one rows with the exact name, then split the input into the single words and return all strings that contain ANY of the input words. this should reduce the rows to analyze from 42000 to a few hundred. Once I have these few hundred lines, i could run a Levenshtein function on the rows and return the 5 most matching..
Is this a good idea?
Solved it this way by combining my custom function with a premade Levenshtein function from this link: How to calculate distance similarity measure of given 2 strings? . I assign a score for each single word that appears in the other wordcomplex. then I add a score based on the Levenshtein comparison of each word to another. works great:
Public Class Form1
Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp
calc()
End Sub
Private Sub TextBox2_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox2.KeyUp
calc()
End Sub
Sub calc()
Label1.Text = compare(TextBox1.Text, TextBox2.Text)
End Sub
Public Function compare(source As String, target As String) As Integer
Dim score As Double
Dim sourcewords As String() = source.Split(New Char() {" "c, "'"c, "`"c, "´"c})
Dim targetwords As String() = target.Split(New Char() {" "c, "'"c, "`"c, "´"c})
For Each s In sourcewords
If target.Contains(s) Then score = score + 1
For Each t In targetwords
score = score + 1 / (DamerauLevenshteinDistance(s, t, 100) + 1)
Next
Next
For Each s In targetwords
If source.Contains(s) Then score = score + 1
For Each t In sourcewords
score = score + 1 / (DamerauLevenshteinDistance(s, t, 100) + 1)
Next
Next
Return score
End Function
''' <summary>
''' Computes the Damerau-Levenshtein Distance between two strings, represented as arrays of
''' integers, where each integer represents the code point of a character in the source string.
''' Includes an optional threshhold which can be used to indicate the maximum allowable distance.
''' </summary>
''' <param name="source">An array of the code points of the first string</param>
''' <param name="target">An array of the code points of the second string</param>
''' <param name="threshold">Maximum allowable distance</param>
''' <returns>Int.MaxValue if threshhold exceeded; otherwise the Damerau-Leveshteim distance between the strings</returns>
Public Shared Function DamerauLevenshteinDistance(source As String, target As String, threshold As Integer) As Integer
Dim length1 As Integer = source.Length
Dim length2 As Integer = target.Length
' Return trivial case - difference in string lengths exceeds threshhold
If Math.Abs(length1 - length2) > threshold Then
Return Integer.MaxValue
End If
' Ensure arrays [i] / length1 use shorter length
If length1 > length2 Then
Swap(target, source)
Swap(length1, length2)
End If
Dim maxi As Integer = length1
Dim maxj As Integer = length2
Dim dCurrent As Integer() = New Integer(maxi) {}
Dim dMinus1 As Integer() = New Integer(maxi) {}
Dim dMinus2 As Integer() = New Integer(maxi) {}
Dim dSwap As Integer()
For i As Integer = 0 To maxi
dCurrent(i) = i
Next
Dim jm1 As Integer = 0, im1 As Integer = 0, im2 As Integer = -1
For j As Integer = 1 To maxj
' Rotate
dSwap = dMinus2
dMinus2 = dMinus1
dMinus1 = dCurrent
dCurrent = dSwap
' Initialize
Dim minDistance As Integer = Integer.MaxValue
dCurrent(0) = j
im1 = 0
im2 = -1
For i As Integer = 1 To maxi
Dim cost As Integer = If(source(im1) = target(jm1), 0, 1)
Dim del As Integer = dCurrent(im1) + 1
Dim ins As Integer = dMinus1(i) + 1
Dim [sub] As Integer = dMinus1(im1) + cost
'Fastest execution for min value of 3 integers
Dim min As Integer = If((del > ins), (If(ins > [sub], [sub], ins)), (If(del > [sub], [sub], del)))
If i > 1 AndAlso j > 1 AndAlso source(im2) = target(jm1) AndAlso source(im1) = target(j - 2) Then
min = Math.Min(min, dMinus2(im2) + cost)
End If
dCurrent(i) = min
If min < minDistance Then
minDistance = min
End If
im1 += 1
im2 += 1
Next
jm1 += 1
If minDistance > threshold Then
Return Integer.MaxValue - 1
End If
Next
Dim result As Integer = dCurrent(maxi)
Return If((result > threshold), Integer.MaxValue, result)
End Function
Private Shared Sub Swap(Of T)(ByRef arg1 As T, ByRef arg2 As T)
Dim temp As T = arg1
arg1 = arg2
arg2 = temp
End Sub
End Class
One way is to use the build-in soundex function of MySQL.
SELECT SOUNDEX(name) FROM table;
Or, the better way, there are a few MySQL-functions on the web implementing DoubleMetaphone. I think this is what you are searching:
GitHub

Value used in formula of wrong data type

I have been trying to figure this error out for the past few days with no luck. I am hoping one of you would be able to help. I am getting "value used in formula of wrong data type.
Quick explanation:
convert functions like this one to its corresponding text (20054/18393)*100.0
the 5 digit numbers are Field IDs that refer to questions.
ID Question
20054 How many days of year do you work
18393 How many days of vacation do you get a year
The result I am trying to get to is (How many days of year do you work / How many days of vacation do you get a year) *100.0
It could be easily done manually if it was just a hand full. I have over 2600 formulas that need to be converted.
I created this function below which is resulting in the error mentioned in the title. Any assistance would be greatly appreciated
Here is my function
Function Test(sInput As String) As String
Dim i As Long
Dim num As String
Dim Text, a, str, shortname As String
For i = 1 To Len(sInput)
a = Mid(sInput, i, 1)
If IsNumeric(a) Then
num = num & a
Text = ""
Else
If a = "." Then
num = num & a
Else
'search for num value in second sheet short name
shortname = WorksheetFunction.VLookup(WorksheetFunction.Int(num), Worksheets("questionlist").Range("A3:F2537"), 5, False)
num = ""
End If
Text = shortname & a
shortname = ""
End If
str = str & Text
Next
Test = str
End Function
The error is raised because you are passing blank value to INT Function in the line
WorksheetFunction.VLookup(WorksheetFunction.Int(num), Worksheets("questionlist").Range("A3:F2537"), 5, False)
To reproduce the error Type =INT("") in any cell
To fix this handle blank values
Updated Answer:
Function Formula2Text(ByRef myCell As Range) As String
Dim QuestionId As Integer
Dim strInput As String
'Get Formula instead of values
strInput = myCell.FormulaR1C1
'Use Regex to Catch all ID's
Set Regex = CreateObject("VBScript.RegExp")
Set rnglookup = Worksheets("questionlist").Range("A3:F2537")
Regex.Global = True
Regex.Pattern = "\d+"
For Each Match In Regex.Execute(strInput)
'Skip if the ID is 100
If (Match.Value <> 100) Then QuestionId = Match.Value
'Lookup ID in the rnglookup,Make sure the Ids are sorted in asc in the questionlist sheet
Qntxt = Application.VLookup(QuestionId, rnglookup, 5, False)
If IsError(Qntxt) Then Qntxt = "Missing Lookup"
'Replace the ID with the lookup
strInput = Replace(strInput, QuestionId, Qntxt)
Next
Formula2Text = strInput
End Function
Usage:In the cell next to the formula use the function by referencing the formula
=Formula2Text(A1)

how to import 2 row from txt or EXCEL into the same ROW in SQL server

I need to extract information from a page I have access to.
Just in this module, I have no means to export, but just to copying and pasting the information
Looks like this in the same l
1. MANUF MODEL YEAR MFG SERIAL REG OS DATE DESC LISTED1. YEAR DLV
2. monster 4r25 1988 23547248 Waka001 7/23/2012 For sale 7/22/20092. 1989
3. FORD 12SE 1994 6262552 DBZRLZ0 7/26/2012 For sale 7/9/20093. 1994
I'm getting my data in rows, but the year mfg and year dlv is in 2 rows within one row (or 2 rows in the same field). When pasted on excel it makes 2 rows first with all the data in the row including year mng and a second row just for year dlv (in the same column).
I can parse this information in excel by adding extra column and coping that extra field and deleting blanks and so on. But I want to omit the excel part and import this from a TXT file which when pasted creates 2 rows per row as well and using tabs as delimiter (like txt text tab delimited).
When I import with bulk insert, it imports twice as much rows, but I can't imagine a way to parse this second row into a new column.
Can someone help with this? In t-sql (every row has only one row of info, but in the column year mfg /year dlv, comes with two rows).
Or point me on what to read or which would be a better approach? Maybe importing 2 rows at once ETC.
You can import the data set from the text file into a temp table including the blank lines. This will give you a data set in SQL with 2 types of records. 1. Records that have all data except delivery date. 2. Records that have only delivery dates and no other fields. (Add a unique auto increment key)
Because the related records will be one record apart, Record N and Records N+1 are actually the same record.
Then a select query Joining the temp table to its self by RecID = RecId+1 will give a complete record with all fields
SELECT * FROM tmpTable AS MainRecord INNER JOIN tmpTable AS MissingField ON MainRecord.RecId = MissingField.RecId +1
From this dataset you can instert into your main data.
Do you know how to use VBA? You can run this code (FixData()) in Excel before you use it in TSQL so it fixes the extra row problem. Hope this helps
Option Explicit
Public Sub FixData()
Dim ws As Excel.Worksheet
Dim iCurRow As Long
Dim iLastRow As Long
Set ws = ActiveSheet
iLastRow = ws.Cells.SpecialCells(xlCellTypeLastCell).Row
' move through the spreadsheet from bottom to top
For iCurRow = iLastRow To 1 Step -1
If (isCurrentRowMessedUp(ws, iCurRow) = True) Then
Call AppendDataToPreviousRow(ws, iCurRow)
' delete the row since we moved the data out of there
ws.Rows(iCurRow).EntireRow.Delete
End If
Next
End Sub
Private Sub AppendDataToPreviousRow(ByRef ws As Excel.Worksheet, ByVal currentRow As Long)
Dim firstCellInRow As Excel.Range
Dim lastCellInRow As Excel.Range
Dim previousRowRangeToPaste As Excel.Range
' check if first column has data in it, otherwise find the first column that has data
If (ws.Cells(currentRow, 1).Value = vbNullString) Then
Set firstCellInRow = ws.Cells(currentRow, 1).End(xlToRight)
Else
Set firstCellInRow = ws.Cells(currentRow, 1)
End If
Set lastCellInRow = ws.Cells(currentRow, ws.Columns.Count).End(xlToLeft)
Set previousRowRangeToPaste = ws.Cells(currentRow - 1, getNextColumnAvailableInPreviousRow(ws, currentRow))
ws.Range(firstCellInRow, lastCellInRow).Cut previousRowRangeToPaste
End Sub
Private Function isCurrentRowMessedUp(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) As Boolean
Dim cellCountInRow As Long
Dim firstCellInRow As Excel.Range
Dim lastCellInRow As Excel.Range
Set firstCellInRow = ws.Cells(currentRow, 1)
Set lastCellInRow = ws.Cells(currentRow, ws.Columns.Count).End(xlToLeft)
cellCountInRow = Application.WorksheetFunction.CountA(ws.Range(firstCellInRow, lastCellInRow))
If (cellCountInRow <= 1) Then
isCurrentRowMessedUp = True
Else
isCurrentRowMessedUp = False
End If
End Function
Private Function getLastColumnInPreviousRow(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) As Long
Dim rng As Excel.Range
Set rng = ws.Cells(currentRow - 1, 1).End(xlToRight)
getLastColumnInPreviousRow = rng.Column
End Function
Private Function getNextColumnAvailableInPreviousRow(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) As Long
getNextColumnAvailableInPreviousRow = getLastColumnInPreviousRow(ws, currentRow) + 1
End Function
you can use SQL Server Integration Service (SSIS) for convert data from any source data such as excel to any destination data such as SQL Server