Extract data from website in a loop (changing date) to Sheet - html
First off, I'm completely new to VBA and want to build a weather statistic in Excel. Help is much appreciated!
For this I need data from multiple websites which include the information of the messured temperature, the weather station and the time.
With some VBA information I found on the internet I have so far written a code, which delivers me the needed information for one website in the immediate window only.
Which looks like this:
7.4°C | Wien-Mariabrunn (225m) | 14:00
7.6°C | Wien-Hohe Warte (198m) | 14:00
7.6°C | Wien-Unterlaa (200m) | 14:00
7.7°C | Wien-Schwechat (183m) | 14:00
7.8°C | Wien-Donaufeld (160m) | 14:00
8.1°C | Grossenzersdorf (154m) | 14:00
8.2°C | Wien-City (177m) | 14:00
Dim xmlReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim Temps1 As MSHTML.IHTMLElementCollection
Dim temps2 As MSHTML.IHTMLElementCollection
Dim Temp As MSHTML.IHTMLElement
xmlReq.Open "GET", "https://kachelmannwetter.com/at/messwerte/wien/temperatur/20190101-1300z.html", False
xmlReq.send
If xmlReq.Status <> 200 Then
MsgBox "Problem" & vbNewLine & xmlReq.Status & " - " & xmlReq.statusText
Exit Sub
End If
HTMLDoc.body.innerHTML = xmlReq.responseText
Set Temps1 = HTMLDoc.getElementsByClassName("ap o o-1 o-tmp-5")
Set temps2 = HTMLDoc.getElementsByClassName("ap o o-1 o-tmp-1")
For Each Temp In Temps1
Debug.Print Temp.Title
Next Temp
For Each Temp In temps2
Debug.Print Temp.Title
Next Temp
First problem I have, is that I have no idea how to put this information into a sheet or cells.
The second problem is, that I need the same data from multiple websites from a startdate to enddate.
The website I used in this example-code is https://kachelmannwetter.com/at/messwerte/wien/temperatur/20190101-1300z.html. At the end you can find the date, in this example "20190101" & the time "1300".
So for this I need some kind of loop.
So at the end I need a worksheet with column A with the date, column B with the time and column C with the data (for each date & time).
I hope this was somehow understandable and I am very thankful for any help.
The following works nicely for short date ranges. Longer date ranges lead to slower response. It is likely website is blocking/throttling. To this end:
I include a variable pauseIndex which means that every x (=pauseIndex) number of urls, a delay of y seconds (as specified by waitSeconds) before the next request is added. You can play with this.
You might consider adapting code to run in daily/monthly batches and either append to bottom of existing dataset or write to a new sheet using the current day/month as title.
Perhaps rotate/change IPs and use MSXML2.ServerXMLHTTP.
Tinker with the above to get the optimal settings and batch request size (if doing batch).
Specifying start datetime and end datetimes:
The date ranges are specified in a sheet called Date ranges. It has the following set-up:
Building a dataset:
I would recommend building a flat table dataset where you specify the exact stations to return info for. Not each station appears for every timedate.
stations = Array("Wien-Schwechat", "Wien-Unterlaa", "Wien-Mariabrunn", "Wien-Hohe Warte", "Grossenzersdorf", _
"Wien-Donaufeld", "Wien-City")
You can expand this. I include a dictionary variable newStations which stores all stations encountered that are not in your list for monitoring. You can easily write these out to help decide on additional stations to monitor/include in your dataset.
Placeholder values for missing station readings are used to ensure a complete dataset.
You may wish to normalise “outliers” - for example, actual hour values can lie within the range rather than on the hour. In the demo below 16:20 is retrieved accurately for one station. You could normalise this to 16:00.
Helper functions/Sub:
There are a number of helper functions, and 1 sub, used in the code.
GetAllLinks. Generates all request urls between start date time and end date time. See notes within code. These can be looped to issue each request for data
EmptyDict - ensure that station data is cleared out between requests
UpdateDictForNoReading. Handles the case where a monitored station is not reported for specified datetime. It updates temperature and long station description with "No reading"
WriteOutResults. Produces a "flat" i.e. not nested, 2D array structure and writes results to specified output sheet
Retrieving stations and station data:
I use a css attribute = value selector, with contains operator, to target the station data.
Taking a example station's HTML
<a class="ap o o-1 o-tmp--1" data-target="#obs-detail-3h" data-toggle="modal" data-left="635" data-top="545" onclick="obs_detail_3h('-1.0°C', 'Wien-Schwechat (183m)', '16:20','110360', '201901031500');" title="-1.0°C | Wien-Schwechat (183m) | 16:20" style="left: 408.533px; top: 337.757px;">-1</a>
If we look at the class attribute we see that is it as follows:
class="ap o o-1 o-tmp--1"
The value of the class attribute is "ap o o-1 o-tmp--1", which is in fact a series of classes separated by spaces. Each station class value has the same substring which is o-tmp. You could vary this slightly. I use querySelectorAll to return a nodeList of all elements which have this substring in the class attribute value.
Set mapStations = html.querySelectorAll("[class*='o-tmp']")
This matches all stations on the page (map).
The title attribute of each node in the nodeList (mapStations) contains the data of interest:
title="-1.0°C | Wien-Schwechat (183m) | 16:20"
The string contains pipe (|) delimiters. I can use split() to generate an array containing each bit of info:
arr = Split(mapStations.item(i).Title, " | ")
That generates an array which will have -1.0°C , Wien-Schwechat (183m) and 16:20 at different indices. Using this example, I store -1.0°C in variable temp, Wien-Schwechat (183m) in variable stationFull, just the station name Wien-Schwechatin station, 16:20 in time.
TODO:
Refactor to reduce level of nesting
Error handling for cases such as status code <> 200.....
Move variable declarations closer to their usage
Requirements:
VBE > Tools > References > Add reference to Microsoft HTML Object Library
Worksheet called Date ranges
Worksheet called Output
Data in Date ranges should be laid out as shown in image above.
VBA:
Option Explicit
Public Sub GetInfo()
'VBE > Tools > References > Microsoft HTML Object Library
'Collect hourly temperature readings from list of stations in array stations.
'Missing readings are populated with "Missing reading". Times specified in request are not necessarily identical _
'to that correctly returned from page as reading reported within an hour interval may not be on the hour
Dim html As HTMLDocument, i As Long, arr() As String, mapStations As Object, dict As Object, newStations As Object
Dim time As String, station As String, temp As String, stations(), results(), j As Long
Dim urls As Object, url As Variant, startOfDateString As Long, currDate As String, stationFull As String
Dim outputSht As Worksheet, x As Long
Const pauseIndex As Long = 20
Const waitSeconds As Long = 1
Const PREFIX As String = "https://kachelmannwetter.com/at/messwerte/wien/temperatur/"
Const SUFFIX As String = "z.html"
startOfDateString = InStrRev(PREFIX, "/") + 1
Set outputSht = ThisWorkbook.Worksheets("Output")
Set urls = GetAllLinks(PREFIX, SUFFIX)
Set html = New HTMLDocument
Set dict = CreateObject("Scripting.Dictionary")
Set newStations = CreateObject("Scripting.Dictionary")
stations = Array("Wien-Schwechat", "Wien-Unterlaa", "Wien-Mariabrunn", "Wien-Hohe Warte", "Grossenzersdorf", _
"Wien-Donaufeld", "Wien-City") 'order of stations here should match that in sheet
j = 1
For i = LBound(stations) To UBound(stations)
dict(stations(i)) = vbNullString
Next
ReDim results(1 To 1 * urls.Count)
With CreateObject("MSXML2.XMLHTTP")
For Each url In urls
x = x + 1
If x Mod pauseIndex = 0 Then Application.Wait Now + TimeSerial(0, 0, waitSeconds)
DoEvents
.Open "GET", url, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
Set mapStations = html.querySelectorAll("[class*='o-tmp']")
For i = 0 To mapStations.Length - 1
arr = Split(mapStations.item(i).Title, " | ")
currDate = Join(Array(Mid$(url, startOfDateString + 4, 2), Mid$(url, startOfDateString + 6, 2), Mid$(url, startOfDateString, 4)), "-")
temp = arr(0)
station = Split(arr(1), " (")(0)
stationFull = arr(1)
time = arr(2)
If Not dict.Exists(station) Then
newStations(station) = vbNullString 'in case you are interested in which stations exist you are not monitoring
Else
dict(station) = Array(currDate, time, station, stationFull, temp)
End If
Next
Set dict = UpdateDictForNoReading(dict, currDate, time)
results(j) = dict.items
j = j + 1
Set dict = EmptyDict(dict)
Next
End With
WriteOutResults outputSht, results, UBound(stations) + 1
End Sub
Public Function UpdateDictForNoReading(ByVal dict As Object, ByVal currDate As String, ByVal time As String) As Object
'Loop dictionary containing station readings. If current value for key is not an array then no readings where found. _
'then dict is updated with "No reading" for station long text (which includes m e.g. Wien-Schwechat (183m)) and temperature
Dim key As Variant
For Each key In dict
If Not IsArray(dict(key)) Then dict(key) = Array(currDate, time, key, "No reading", "No reading")
Next
Set UpdateDictForNoReading = dict
End Function
Public Sub WriteOutResults(ByVal ws As Worksheet, ByRef results As Variant, ByVal stationCount As Long)
'Loop results array which at each index should have a child array which is comprised of all stations specified _
'The code unravels the nested structure into "flat" array for writing out to sheet. Aim is to be more efficient _
'with writing out to sheet
'The sheet to write results to is passed as argument ws. Headers are stated below.
Dim headers(), outputArr(), i As Long, arr(), j As Long, r As Long, c As Long
headers = Array("Date", "Time", "Station", "StationFull", "Temp")
ReDim outputArr(1 To UBound(results) * stationCount, 1 To UBound(headers) + 1)
For i = LBound(results) To UBound(results)
arr = results(i) '0-6
For j = LBound(arr) To UBound(arr)
r = r + 1
If IsArray(arr(j)) Then
For c = LBound(arr(j)) To UBound(arr(j))
outputArr(r, c + 1) = arr(j)(c)
Next
End If
Next
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(outputArr, 1), UBound(outputArr, 2)) = outputArr
End With
End Sub
Public Function EmptyDict(ByVal dict As Object) As Object
'Ensures no data remains from prior request
Dim key As Variant
For Each key In dict
dict(key) = vbNullString
Next
Set EmptyDict = dict
End Function
Public Function GetAllLinks(ByVal PREFIX As String, ByVal SUFFIX As String) As Object
'Generate all urls between start date time and end date time. Accounts for fact that there is no real midnight. 00:00 uhr retrieves the 23:00 of prior day.
'Times selected on page e.g. 1:00 uhr are one hour ahead of what is used in url string e.g. 1 > 0.
Dim ws As Worksheet, hours(), urls As Collection
Set urls = New Collection
Set ws = ThisWorkbook.Worksheets("Date ranges")
'url "0000" = 1am. The selection of 00:00 in the sheet gives 23:00 of prior date
hours = Array("0000", "0100", "0200", "0300", "0400", "0500", "0600", "0700", "0800", "0900", "1000", "1100", "1200", _
"1300", "1400", "1500", "1600", "1700", "1800", "1900", "2000", "2100", "2200", "2300")
Dim startDate As Date, endDate As Date, startTime As String, endTime As String, currentDate As Date
Dim endIndex As Long, startIndex As Long
With ws
startDate = .Cells(1, 2).Value2 'Required for input yyyy-mm-dd; Required for output yyyymmdd
endDate = .Cells(1, 5).Value2
startTime = .Cells(2, 2)
endTime = .Cells(2, 5)
End With
startIndex = Application.Match(startTime, hours) - 2
endIndex = Application.Match(endTime, hours) - 2
currentDate = startDate
Dim i As Long, s As Long, e As Long
Do While currentDate <= endDate
If startDate = endDate Then
s = startIndex
e = endIndex
Else
Select Case currentDate
Case startDate
s = startIndex
e = UBound(hours)
Case endDate
s = LBound(hours)
e = endIndex
Case Else
s = LBound(hours)
e = UBound(hours)
End Select
End If
For i = s To e
urls.Add PREFIX & Format$(currentDate, "yyyymmdd") & "-" & hours(i) & SUFFIX
Next
currentDate = DateAdd("d", 1, currentDate)
Loop
Set GetAllLinks = urls
End Function
Example output:
Related
What Discrete Optimization family is this?
I am given N lists of M items that will be physically realized (someone actually has to put items (names abbreviated here,) in physical bins.) Then the bins are emptied, if necessary, and re-used, working left-to-right. There is a real cost to putting a different item in a bin than what was in it before. I rearrange the lists to minimize changes, manually. Software can do it faster, and more reliably in an optimum way. The whole thing happens in Excel (then paper, then in a factory.) I wrote some VBA, a brute-force affair, that did really well with some examples. But not all. If I knew the family of optimization that this is, I could code it, even if I just pass something to a DLL. But multiple searches online have not succeeded. I tried several phrasings. It's not a traveling S.., knapsack, etc. It seems similar to the Sequence Alignment problem from Bioinformatics. Someone recognize it? Let's hear it, Operations Research people.
As it turns out, the naive solution just needed tweaking. Look at a cell. Try to find the same letter in the column to it's right. If you find one, swap it with whatever it to the right of that cell now. Work your way down. The ColumnsPer parameter accounts for the real-world use, where each column has an associated list of numbers and the grid columns alternate labels, numbers, labels, ... Option Explicit Public Const Row1 As Long = 4 Public Const ColumnsPer As Long = 1 '2, when RM, % Public Const BinCount As Long = 6 Public Const ColCount As Long = 6 Private Sub reorder_items_max_left_to_right_repeats(wksht As Worksheet, _ col1 As Long, maxBins As Long, maxRecipes As Long, ByVal direction As Integer) Dim here As Range Set here = wksht.Cells(Row1, col1) here.Activate Dim cond For cond = 1 To maxRecipes - 1 Do While WithinTheBox(here, col1, direction) If Not Adjacent(here, ColumnsPer).Value = here.Value Then Dim there As Range Set there = Matching_R_ange(here, direction) If Not there Is Nothing Then swapThem Adjacent(here, ColumnsPer), there End If NextItemDown: Set here = here.Offset(direction, 0) here.Activate 'Debug.Assert here.Address <> "$AZ$6" DoEvents Loop NextCond: Select Case direction Case 1 Set here = Cells(Row1, here.Column + ColumnsPer) Case -1 Set here = Cells(Row1 + maxBins - 1, here.Column + ColumnsPer) End Select here.Activate Next cond End Sub Function Adjacent(fromHereOnLeft As Range, colsRight As Long) As Range Set Adjacent = fromHereOnLeft.Offset(0, colsRight) End Function Function Matching_R_ange(fromHereOnLeft As Range, _ ByVal direction As Integer) As Range Dim rowStart As Long rowStart = Row1 Dim colLook As Long colLook = fromHereOnLeft.Offset(0, ColumnsPer).Column Dim c As Range Set c = Cells(rowStart, colLook) Dim col1 As Long col1 = c.Column Do While WithinTheBox(c, col1, direction) Debug.Print "C " & c.Address If c.Value = fromHereOnLeft.Value _ And c.Row <> fromHereOnLeft.Row Then Set Matching_R_ange = c Exit Function Else Set c = c.Offset(1 * direction, 0) End If DoEvents Loop 'returning NOTHING is expected, often End Function Function WithinTheBox(ByVal c As Range, ByVal col1 As Long, ByVal direction As Integer) Select Case direction Case 1 WithinTheBox = c.Row <= Row1 + BinCount - 1 And c.Row >= Row1 Case -1 WithinTheBox = c.Row <= Row1 + BinCount - 1 And c.Row > Row1 End Select WithinTheBox = WithinTheBox And _ c.Column >= col1 And c.Column < col1 + ColCount - 1 End Function Private Sub swapThem(range10 As Range, range20 As Range) 'Unlike with SUB 'Matching_R_ange', we have to swap the %s as well as the items 'So set temporary range vars to hold %s, to avoid confusion due to referencing items/r_anges If ColumnsPer = 2 Then Dim range11 As Range Set range11 = range10.Offset(0, 1) Dim range21 As Range Set range21 = range20.Offset(0, 1) 'sit on them for now End If Dim Stak As Object Set Stak = CreateObject("System.Collections.Stack") Stak.push (range10.Value) 'A Stak.push (range20.Value) 'BA range10.Value = Stak.pop 'A range20.Value = Stak.pop '_ Stak is empty now, can re-use If ColumnsPer = 2 Then Stak.push (range11.Value) Stak.push (range21.Value) range11.Value = Stak.pop range21.Value = Stak.pop End If End Sub
ACCESS sql code to multiply record with calculated value from previous cell
I am using Access Database to get a value. I am fairly new to access as I usually use SQLServer and I am having trouble in getting what I want. I have the following table, with column TARGET and incremental Target as the target column that I need to get: Category|Period|Value| TARGET | A | 4 | 1 | 1/1 =1 | A | 3 | 3 | 1/(3*1)=0.33 | (1/value at period 3 * previous target) A | 2 | 6 |1/(0.33*6)=0.505| A | 1 | 9 |1/(0.505*9)=0.22| The data is partitioned by Category and ordered in descending order by Period. For the first row the Target should be: (1/value at current period) For the next rows the Target should be: (1/value at current period * value of previous target) As you can see this is somehow complex as I need to evaluate a cell value and then for the next row I need to use the value in the cell above. Plus I need to get the incremental value for this column as well. Any help will be very much appreciated as I am new to Access and need to get this done soon!
Here is a function placed in general module that can be called from query. Value is a reserved word so I used Data for that field name. Option Compare Database Option Explicit Global dblTar As Double Global strCat As String ____________ Function CalcTarget(strC As String, dblT As Double) As Double If strCat <> strC Then strCat = strC dblTar = dblT End If dblTar = 1 / (dblT * dblTar) CalcTarget = dblTar End Function Calling function in query: SELECT Category, Period, Data, CalcTarget([Category],[Data]) AS Target FROM Table1; Normally I advise not to save calculated data to table when a query can work, but if you prefer to save, then options are: An UPDATE action: UPDATE Table1 SET Target = CalcTarget([Category],[Data]); Or VBA: Sub CalcTarget() Dim rs As DAO.Recordset Dim strCat As String Dim dblTar As Double Set rs = CurrentDb.OpenRecordset("SELECT * FROM table1 ORDER BY Category, Period DESC") strCat = rs!Category dblTar = rs!Data Do While Not rs.EOF If rs!Category = strCat Then dblTar = 1 / (rs!Data * dblTar) rs.Edit rs!Target = dblTar rs.Update rs.MoveNext Else strCat = rs!Category dblTar = rs!Data End If Loop End Sub
vb.net + mysql - Search table for top 5 rows that are the most similar to input values
I have a Database with many columns, one of them containing Names. My vb.net software acts as telegram server and waits for the user to send its full name. The database could have its name spelled differently, for example "Marco Dell'Orso" could be spelled "Marco Dellorso" or "Marco Dell Orso" od "Dell Orso Marco" or whatever. The user could also misspell his name and invert two letters.. for esample "MaCRo Dell'Orso" I would need a way to return the 5 rows that are the most similar to the words used in the query. What would be the best way? I was thinking of splitting the name on whitechars and then use LIKE in the query with the single words, but that does not work with mistyped words. EDIT: My current plan is to that if the database contains more than one or less then one rows with the exact name, then split the input into the single words and return all strings that contain ANY of the input words. this should reduce the rows to analyze from 42000 to a few hundred. Once I have these few hundred lines, i could run a Levenshtein function on the rows and return the 5 most matching.. Is this a good idea?
Solved it this way by combining my custom function with a premade Levenshtein function from this link: How to calculate distance similarity measure of given 2 strings? . I assign a score for each single word that appears in the other wordcomplex. then I add a score based on the Levenshtein comparison of each word to another. works great: Public Class Form1 Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp calc() End Sub Private Sub TextBox2_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox2.KeyUp calc() End Sub Sub calc() Label1.Text = compare(TextBox1.Text, TextBox2.Text) End Sub Public Function compare(source As String, target As String) As Integer Dim score As Double Dim sourcewords As String() = source.Split(New Char() {" "c, "'"c, "`"c, "´"c}) Dim targetwords As String() = target.Split(New Char() {" "c, "'"c, "`"c, "´"c}) For Each s In sourcewords If target.Contains(s) Then score = score + 1 For Each t In targetwords score = score + 1 / (DamerauLevenshteinDistance(s, t, 100) + 1) Next Next For Each s In targetwords If source.Contains(s) Then score = score + 1 For Each t In sourcewords score = score + 1 / (DamerauLevenshteinDistance(s, t, 100) + 1) Next Next Return score End Function ''' <summary> ''' Computes the Damerau-Levenshtein Distance between two strings, represented as arrays of ''' integers, where each integer represents the code point of a character in the source string. ''' Includes an optional threshhold which can be used to indicate the maximum allowable distance. ''' </summary> ''' <param name="source">An array of the code points of the first string</param> ''' <param name="target">An array of the code points of the second string</param> ''' <param name="threshold">Maximum allowable distance</param> ''' <returns>Int.MaxValue if threshhold exceeded; otherwise the Damerau-Leveshteim distance between the strings</returns> Public Shared Function DamerauLevenshteinDistance(source As String, target As String, threshold As Integer) As Integer Dim length1 As Integer = source.Length Dim length2 As Integer = target.Length ' Return trivial case - difference in string lengths exceeds threshhold If Math.Abs(length1 - length2) > threshold Then Return Integer.MaxValue End If ' Ensure arrays [i] / length1 use shorter length If length1 > length2 Then Swap(target, source) Swap(length1, length2) End If Dim maxi As Integer = length1 Dim maxj As Integer = length2 Dim dCurrent As Integer() = New Integer(maxi) {} Dim dMinus1 As Integer() = New Integer(maxi) {} Dim dMinus2 As Integer() = New Integer(maxi) {} Dim dSwap As Integer() For i As Integer = 0 To maxi dCurrent(i) = i Next Dim jm1 As Integer = 0, im1 As Integer = 0, im2 As Integer = -1 For j As Integer = 1 To maxj ' Rotate dSwap = dMinus2 dMinus2 = dMinus1 dMinus1 = dCurrent dCurrent = dSwap ' Initialize Dim minDistance As Integer = Integer.MaxValue dCurrent(0) = j im1 = 0 im2 = -1 For i As Integer = 1 To maxi Dim cost As Integer = If(source(im1) = target(jm1), 0, 1) Dim del As Integer = dCurrent(im1) + 1 Dim ins As Integer = dMinus1(i) + 1 Dim [sub] As Integer = dMinus1(im1) + cost 'Fastest execution for min value of 3 integers Dim min As Integer = If((del > ins), (If(ins > [sub], [sub], ins)), (If(del > [sub], [sub], del))) If i > 1 AndAlso j > 1 AndAlso source(im2) = target(jm1) AndAlso source(im1) = target(j - 2) Then min = Math.Min(min, dMinus2(im2) + cost) End If dCurrent(i) = min If min < minDistance Then minDistance = min End If im1 += 1 im2 += 1 Next jm1 += 1 If minDistance > threshold Then Return Integer.MaxValue - 1 End If Next Dim result As Integer = dCurrent(maxi) Return If((result > threshold), Integer.MaxValue, result) End Function Private Shared Sub Swap(Of T)(ByRef arg1 As T, ByRef arg2 As T) Dim temp As T = arg1 arg1 = arg2 arg2 = temp End Sub End Class
One way is to use the build-in soundex function of MySQL. SELECT SOUNDEX(name) FROM table; Or, the better way, there are a few MySQL-functions on the web implementing DoubleMetaphone. I think this is what you are searching: GitHub
Value used in formula of wrong data type
I have been trying to figure this error out for the past few days with no luck. I am hoping one of you would be able to help. I am getting "value used in formula of wrong data type. Quick explanation: convert functions like this one to its corresponding text (20054/18393)*100.0 the 5 digit numbers are Field IDs that refer to questions. ID Question 20054 How many days of year do you work 18393 How many days of vacation do you get a year The result I am trying to get to is (How many days of year do you work / How many days of vacation do you get a year) *100.0 It could be easily done manually if it was just a hand full. I have over 2600 formulas that need to be converted. I created this function below which is resulting in the error mentioned in the title. Any assistance would be greatly appreciated Here is my function Function Test(sInput As String) As String Dim i As Long Dim num As String Dim Text, a, str, shortname As String For i = 1 To Len(sInput) a = Mid(sInput, i, 1) If IsNumeric(a) Then num = num & a Text = "" Else If a = "." Then num = num & a Else 'search for num value in second sheet short name shortname = WorksheetFunction.VLookup(WorksheetFunction.Int(num), Worksheets("questionlist").Range("A3:F2537"), 5, False) num = "" End If Text = shortname & a shortname = "" End If str = str & Text Next Test = str End Function
The error is raised because you are passing blank value to INT Function in the line WorksheetFunction.VLookup(WorksheetFunction.Int(num), Worksheets("questionlist").Range("A3:F2537"), 5, False) To reproduce the error Type =INT("") in any cell To fix this handle blank values Updated Answer: Function Formula2Text(ByRef myCell As Range) As String Dim QuestionId As Integer Dim strInput As String 'Get Formula instead of values strInput = myCell.FormulaR1C1 'Use Regex to Catch all ID's Set Regex = CreateObject("VBScript.RegExp") Set rnglookup = Worksheets("questionlist").Range("A3:F2537") Regex.Global = True Regex.Pattern = "\d+" For Each Match In Regex.Execute(strInput) 'Skip if the ID is 100 If (Match.Value <> 100) Then QuestionId = Match.Value 'Lookup ID in the rnglookup,Make sure the Ids are sorted in asc in the questionlist sheet Qntxt = Application.VLookup(QuestionId, rnglookup, 5, False) If IsError(Qntxt) Then Qntxt = "Missing Lookup" 'Replace the ID with the lookup strInput = Replace(strInput, QuestionId, Qntxt) Next Formula2Text = strInput End Function Usage:In the cell next to the formula use the function by referencing the formula =Formula2Text(A1)
how to import 2 row from txt or EXCEL into the same ROW in SQL server
I need to extract information from a page I have access to. Just in this module, I have no means to export, but just to copying and pasting the information Looks like this in the same l 1. MANUF MODEL YEAR MFG SERIAL REG OS DATE DESC LISTED1. YEAR DLV 2. monster 4r25 1988 23547248 Waka001 7/23/2012 For sale 7/22/20092. 1989 3. FORD 12SE 1994 6262552 DBZRLZ0 7/26/2012 For sale 7/9/20093. 1994 I'm getting my data in rows, but the year mfg and year dlv is in 2 rows within one row (or 2 rows in the same field). When pasted on excel it makes 2 rows first with all the data in the row including year mng and a second row just for year dlv (in the same column). I can parse this information in excel by adding extra column and coping that extra field and deleting blanks and so on. But I want to omit the excel part and import this from a TXT file which when pasted creates 2 rows per row as well and using tabs as delimiter (like txt text tab delimited). When I import with bulk insert, it imports twice as much rows, but I can't imagine a way to parse this second row into a new column. Can someone help with this? In t-sql (every row has only one row of info, but in the column year mfg /year dlv, comes with two rows). Or point me on what to read or which would be a better approach? Maybe importing 2 rows at once ETC.
You can import the data set from the text file into a temp table including the blank lines. This will give you a data set in SQL with 2 types of records. 1. Records that have all data except delivery date. 2. Records that have only delivery dates and no other fields. (Add a unique auto increment key) Because the related records will be one record apart, Record N and Records N+1 are actually the same record. Then a select query Joining the temp table to its self by RecID = RecId+1 will give a complete record with all fields SELECT * FROM tmpTable AS MainRecord INNER JOIN tmpTable AS MissingField ON MainRecord.RecId = MissingField.RecId +1 From this dataset you can instert into your main data.
Do you know how to use VBA? You can run this code (FixData()) in Excel before you use it in TSQL so it fixes the extra row problem. Hope this helps Option Explicit Public Sub FixData() Dim ws As Excel.Worksheet Dim iCurRow As Long Dim iLastRow As Long Set ws = ActiveSheet iLastRow = ws.Cells.SpecialCells(xlCellTypeLastCell).Row ' move through the spreadsheet from bottom to top For iCurRow = iLastRow To 1 Step -1 If (isCurrentRowMessedUp(ws, iCurRow) = True) Then Call AppendDataToPreviousRow(ws, iCurRow) ' delete the row since we moved the data out of there ws.Rows(iCurRow).EntireRow.Delete End If Next End Sub Private Sub AppendDataToPreviousRow(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) Dim firstCellInRow As Excel.Range Dim lastCellInRow As Excel.Range Dim previousRowRangeToPaste As Excel.Range ' check if first column has data in it, otherwise find the first column that has data If (ws.Cells(currentRow, 1).Value = vbNullString) Then Set firstCellInRow = ws.Cells(currentRow, 1).End(xlToRight) Else Set firstCellInRow = ws.Cells(currentRow, 1) End If Set lastCellInRow = ws.Cells(currentRow, ws.Columns.Count).End(xlToLeft) Set previousRowRangeToPaste = ws.Cells(currentRow - 1, getNextColumnAvailableInPreviousRow(ws, currentRow)) ws.Range(firstCellInRow, lastCellInRow).Cut previousRowRangeToPaste End Sub Private Function isCurrentRowMessedUp(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) As Boolean Dim cellCountInRow As Long Dim firstCellInRow As Excel.Range Dim lastCellInRow As Excel.Range Set firstCellInRow = ws.Cells(currentRow, 1) Set lastCellInRow = ws.Cells(currentRow, ws.Columns.Count).End(xlToLeft) cellCountInRow = Application.WorksheetFunction.CountA(ws.Range(firstCellInRow, lastCellInRow)) If (cellCountInRow <= 1) Then isCurrentRowMessedUp = True Else isCurrentRowMessedUp = False End If End Function Private Function getLastColumnInPreviousRow(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) As Long Dim rng As Excel.Range Set rng = ws.Cells(currentRow - 1, 1).End(xlToRight) getLastColumnInPreviousRow = rng.Column End Function Private Function getNextColumnAvailableInPreviousRow(ByRef ws As Excel.Worksheet, ByVal currentRow As Long) As Long getNextColumnAvailableInPreviousRow = getLastColumnInPreviousRow(ws, currentRow) + 1 End Function
you can use SQL Server Integration Service (SSIS) for convert data from any source data such as excel to any destination data such as SQL Server