First off, I'm completely new to VBA and want to build a weather statistic in Excel. Help is much appreciated!
For this I need data from multiple web pages, each of which gives the measured temperature, the weather station and the time.
Using VBA examples I found on the internet, I have so far written code that prints the needed information for one page, but only to the Immediate window.
The output looks like this:
7.4°C | Wien-Mariabrunn (225m) | 14:00
7.6°C | Wien-Hohe Warte (198m) | 14:00
7.6°C | Wien-Unterlaa (200m) | 14:00
7.7°C | Wien-Schwechat (183m) | 14:00
7.8°C | Wien-Donaufeld (160m) | 14:00
8.1°C | Grossenzersdorf (154m) | 14:00
8.2°C | Wien-City (177m) | 14:00
Dim xmlReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim Temps1 As MSHTML.IHTMLElementCollection
Dim temps2 As MSHTML.IHTMLElementCollection
Dim Temp As MSHTML.IHTMLElement
xmlReq.Open "GET", "https://kachelmannwetter.com/at/messwerte/wien/temperatur/20190101-1300z.html", False
xmlReq.send
If xmlReq.Status <> 200 Then
MsgBox "Problem" & vbNewLine & xmlReq.Status & " - " & xmlReq.statusText
Exit Sub
End If
HTMLDoc.body.innerHTML = xmlReq.responseText
Set Temps1 = HTMLDoc.getElementsByClassName("ap o o-1 o-tmp-5")
Set temps2 = HTMLDoc.getElementsByClassName("ap o o-1 o-tmp-1")
For Each Temp In Temps1
Debug.Print Temp.Title
Next Temp
For Each Temp In temps2
Debug.Print Temp.Title
Next Temp
The first problem is that I have no idea how to put this information into a sheet or cells.
The second problem is that I need the same data from multiple pages, from a start date to an end date.
The page I used in this example code is https://kachelmannwetter.com/at/messwerte/wien/temperatur/20190101-1300z.html. The date and time are at the end of the URL, in this example "20190101" and "1300".
So for this I need some kind of loop.
In the end I need a worksheet with the date in column A, the time in column B and the data in column C (for each date and time).
I hope this was somehow understandable and I am very thankful for any help.
The following works nicely for short date ranges. Longer date ranges lead to slower responses, most likely because the website is blocking or throttling requests. To this end:
I include a constant pauseIndex: after every pauseIndex requests, the code waits waitSeconds seconds before sending the next request. You can play with both values.
You might consider adapting the code to run in daily/monthly batches, and either append to the bottom of the existing dataset or write to a new sheet named after the current day/month.
Perhaps rotate/change IPs and use MSXML2.ServerXMLHTTP.
Tinker with the above to get the optimal settings and batch request size (if doing batch).
Specifying start datetime and end datetimes:
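The date ranges are specified in a sheet called Date ranges. As read by GetAllLinks in the code below, the start date is in B1, the end date in E1, the start time in B2 and the end time in E2.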
Building a dataset:
I would recommend building a flat table dataset where you specify the exact stations to return info for. Not every station appears for every datetime.
stations = Array("Wien-Schwechat", "Wien-Unterlaa", "Wien-Mariabrunn", "Wien-Hohe Warte", "Grossenzersdorf", _
"Wien-Donaufeld", "Wien-City")
You can expand this. I include a dictionary variable newStations which stores all stations encountered that are not in your list for monitoring. You can easily write these out to help decide on additional stations to monitor/include in your dataset.
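A minimal sketch of writing out the unmonitored stations, assuming newStations is the Scripting.Dictionary filled in GetInfo and that the caller passes a worksheet to receive the names. The sub name WriteOutNewStations and the use of column A are illustrative choices, not part of the original code.

```vba
Public Sub WriteOutNewStations(ByVal newStations As Object, ByVal ws As Worksheet)
    'Hypothetical helper: lists each unmonitored station name in column A of ws
    Dim keys As Variant, i As Long
    If newStations.Count = 0 Then Exit Sub
    keys = newStations.keys                 'zero-based array of station names
    For i = LBound(keys) To UBound(keys)
        ws.Cells(i + 1, 1).Value = keys(i)  'row 1 holds the first key
    Next
End Sub
```

It could be called at the end of GetInfo, for example with a spare sheet of your choosing.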
Placeholder values for missing station readings are used to ensure a complete dataset.
You may wish to normalise “outliers” - for example, actual hour values can lie within the range rather than on the hour. In the demo below 16:20 is retrieved accurately for one station. You could normalise this to 16:00.
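One possible way to do that normalisation, sketched under the assumption that the time is always text in hh:mm form as in the title attribute. NormaliseToHour is a hypothetical name; it simply replaces the minutes with 00 and leaves anything without a colon untouched.

```vba
Public Function NormaliseToHour(ByVal timeText As String) As String
    'Hypothetical helper: "16:20" becomes "16:00"; text without ":" is returned unchanged
    Dim colonPos As Long
    colonPos = InStr(timeText, ":")
    If colonPos = 0 Then
        NormaliseToHour = timeText
    Else
        NormaliseToHour = Left$(timeText, colonPos) & "00"
    End If
End Function
```

In GetInfo this would be applied where the time is read, e.g. time = NormaliseToHour(arr(2)).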
Helper functions/Sub:
There are a number of helper functions, and one sub, used in the code.
GetAllLinks. Generates all request URLs between the start datetime and end datetime. See notes within the code. These can be looped over to issue each request for data.
EmptyDict. Ensures that station data is cleared out between requests.
UpdateDictForNoReading. Handles the case where a monitored station is not reported for the specified datetime. It sets the temperature and long station description to "No reading".
WriteOutResults. Produces a "flat", i.e. not nested, 2D array and writes the results to the specified output sheet.
Retrieving stations and station data:
I use a css attribute = value selector, with contains operator, to target the station data.
Taking an example station's HTML:
<a class="ap o o-1 o-tmp--1" data-target="#obs-detail-3h" data-toggle="modal" data-left="635" data-top="545" onclick="obs_detail_3h('-1.0°C', 'Wien-Schwechat (183m)', '16:20','110360', '201901031500');" title="-1.0°C | Wien-Schwechat (183m) | 16:20" style="left: 408.533px; top: 337.757px;">-1</a>
If we look at the class attribute we see that it is as follows:
class="ap o o-1 o-tmp--1"
The value of the class attribute is "ap o o-1 o-tmp--1", which is in fact a series of classes separated by spaces. Each station class value has the same substring which is o-tmp. You could vary this slightly. I use querySelectorAll to return a nodeList of all elements which have this substring in the class attribute value.
Set mapStations = html.querySelectorAll("[class*='o-tmp']")
This matches all stations on the page (map).
The title attribute of each node in the nodeList (mapStations) contains the data of interest:
title="-1.0°C | Wien-Schwechat (183m) | 16:20"
The string contains pipe (|) delimiters. I can use split() to generate an array containing each bit of info:
arr = Split(mapStations.item(i).Title, " | ")
That generates an array with -1.0°C, Wien-Schwechat (183m) and 16:20 at different indices. Using this example, I store -1.0°C in variable temp, Wien-Schwechat (183m) in variable stationFull, just the station name Wien-Schwechat in station, and 16:20 in time.
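The splitting step can be tried on its own, without any web request. The sketch below assumes the title always has the three pipe-delimited parts shown above; ParseStationTitle is a hypothetical name, and it returns Empty when the title has a different shape.

```vba
Public Function ParseStationTitle(ByVal titleText As String) As Variant
    'Hypothetical helper: returns Array(temp, station, stationFull, time), or Empty if malformed
    Dim parts() As String
    parts = Split(titleText, " | ")
    If UBound(parts) <> 2 Then Exit Function
    'Station name is everything before " (" in e.g. "Wien-Schwechat (183m)"
    ParseStationTitle = Array(parts(0), Split(parts(1), " (")(0), parts(1), parts(2))
End Function
```

For the example title, index 1 of the result holds Wien-Schwechat and index 3 holds 16:20.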
TODO:
Refactor to reduce level of nesting
Error handling for cases such as status code <> 200
Move variable declarations closer to their usage
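As a starting point for the status-code item, here is a rough sketch of a request wrapper that only returns the page body on a 200 response. GetResponseText is a hypothetical name and is not used in the code below; it also does not trap run-time errors raised by .send itself (for example when there is no connection).

```vba
Public Function GetResponseText(ByVal url As String) As String
    'Hypothetical helper: returns the page body, or vbNullString when status <> 200
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        If .Status = 200 Then
            GetResponseText = .responseText
        Else
            'Log the failure to the Immediate window and let the caller skip this url
            Debug.Print "Skipped " & url & ": " & .Status & " - " & .statusText
        End If
    End With
End Function
```

A caller would test Len(GetResponseText(url)) > 0 before parsing, so that a failed request leaves the monitored stations to be filled with "No reading".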
Requirements:
VBE > Tools > References > Add reference to Microsoft HTML Object Library
Worksheet called Date ranges
Worksheet called Output
Data in Date ranges should be laid out in the cells that GetAllLinks reads (B1, E1, B2, E2).
VBA:
Option Explicit
Public Sub GetInfo()
'VBE > Tools > References > Microsoft HTML Object Library
'Collect hourly temperature readings from list of stations in array stations.
'Missing readings are populated with "No reading". Times specified in request are not necessarily identical _
'to those returned from the page, as a reading reported within an hour interval may not be on the hour
Dim html As HTMLDocument, i As Long, arr() As String, mapStations As Object, dict As Object, newStations As Object
Dim time As String, station As String, temp As String, stations(), results(), j As Long
Dim urls As Object, url As Variant, startOfDateString As Long, currDate As String, stationFull As String
Dim outputSht As Worksheet, x As Long
Const pauseIndex As Long = 20
Const waitSeconds As Long = 1
Const PREFIX As String = "https://kachelmannwetter.com/at/messwerte/wien/temperatur/"
Const SUFFIX As String = "z.html"
startOfDateString = InStrRev(PREFIX, "/") + 1
Set outputSht = ThisWorkbook.Worksheets("Output")
Set urls = GetAllLinks(PREFIX, SUFFIX)
Set html = New HTMLDocument
Set dict = CreateObject("Scripting.Dictionary")
Set newStations = CreateObject("Scripting.Dictionary")
stations = Array("Wien-Schwechat", "Wien-Unterlaa", "Wien-Mariabrunn", "Wien-Hohe Warte", "Grossenzersdorf", _
"Wien-Donaufeld", "Wien-City") 'order of stations here should match that in sheet
j = 1
For i = LBound(stations) To UBound(stations)
dict(stations(i)) = vbNullString
Next
ReDim results(1 To 1 * urls.Count)
With CreateObject("MSXML2.XMLHTTP")
For Each url In urls
x = x + 1
If x Mod pauseIndex = 0 Then Application.Wait Now + TimeSerial(0, 0, waitSeconds)
DoEvents
.Open "GET", url, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
Set mapStations = html.querySelectorAll("[class*='o-tmp']")
For i = 0 To mapStations.Length - 1
arr = Split(mapStations.item(i).Title, " | ")
currDate = Join(Array(Mid$(url, startOfDateString + 4, 2), Mid$(url, startOfDateString + 6, 2), Mid$(url, startOfDateString, 4)), "-")
temp = arr(0)
station = Split(arr(1), " (")(0)
stationFull = arr(1)
time = arr(2)
If Not dict.Exists(station) Then
newStations(station) = vbNullString 'in case you are interested in which stations exist you are not monitoring
Else
dict(station) = Array(currDate, time, station, stationFull, temp)
End If
Next
Set dict = UpdateDictForNoReading(dict, currDate, time)
results(j) = dict.items
j = j + 1
Set dict = EmptyDict(dict)
Next
End With
WriteOutResults outputSht, results, UBound(stations) + 1
End Sub
Public Function UpdateDictForNoReading(ByVal dict As Object, ByVal currDate As String, ByVal time As String) As Object
'Loop dictionary containing station readings. If current value for key is not an array then no readings were found, _
'so dict is updated with "No reading" for station long text (which includes m e.g. Wien-Schwechat (183m)) and temperature
Dim key As Variant
For Each key In dict
If Not IsArray(dict(key)) Then dict(key) = Array(currDate, time, key, "No reading", "No reading")
Next
Set UpdateDictForNoReading = dict
End Function
Public Sub WriteOutResults(ByVal ws As Worksheet, ByRef results As Variant, ByVal stationCount As Long)
'Loop results array which at each index should have a child array which is comprised of all stations specified _
'The code unravels the nested structure into "flat" array for writing out to sheet. Aim is to be more efficient _
'with writing out to sheet
'The sheet to write results to is passed as argument ws. Headers are stated below.
Dim headers(), outputArr(), i As Long, arr(), j As Long, r As Long, c As Long
headers = Array("Date", "Time", "Station", "StationFull", "Temp")
ReDim outputArr(1 To UBound(results) * stationCount, 1 To UBound(headers) + 1)
For i = LBound(results) To UBound(results)
arr = results(i) '0-6
For j = LBound(arr) To UBound(arr)
r = r + 1
If IsArray(arr(j)) Then
For c = LBound(arr(j)) To UBound(arr(j))
outputArr(r, c + 1) = arr(j)(c)
Next
End If
Next
Next
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(outputArr, 1), UBound(outputArr, 2)) = outputArr
End With
End Sub
Public Function EmptyDict(ByVal dict As Object) As Object
'Ensures no data remains from prior request
Dim key As Variant
For Each key In dict
dict(key) = vbNullString
Next
Set EmptyDict = dict
End Function
Public Function GetAllLinks(ByVal PREFIX As String, ByVal SUFFIX As String) As Object
'Generate all urls between start date time and end date time. Accounts for fact that there is no real midnight. 00:00 uhr retrieves the 23:00 of prior day.
'Times selected on page e.g. 1:00 uhr are one hour ahead of what is used in url string e.g. 1 > 0.
Dim ws As Worksheet, hours(), urls As Collection
Set urls = New Collection
Set ws = ThisWorkbook.Worksheets("Date ranges")
'url "0000" = 1am. The selection of 00:00 in the sheet gives 23:00 of prior date
hours = Array("0000", "0100", "0200", "0300", "0400", "0500", "0600", "0700", "0800", "0900", "1000", "1100", "1200", _
"1300", "1400", "1500", "1600", "1700", "1800", "1900", "2000", "2100", "2200", "2300")
Dim startDate As Date, endDate As Date, startTime As String, endTime As String, currentDate As Date
Dim endIndex As Long, startIndex As Long
With ws
startDate = .Cells(1, 2).Value2 'Required for input yyyy-mm-dd; Required for output yyyymmdd
endDate = .Cells(1, 5).Value2
startTime = .Cells(2, 2)
endTime = .Cells(2, 5)
End With
startIndex = Application.Match(startTime, hours) - 2
endIndex = Application.Match(endTime, hours) - 2
currentDate = startDate
Dim i As Long, s As Long, e As Long
Do While currentDate <= endDate
If startDate = endDate Then
s = startIndex
e = endIndex
Else
Select Case currentDate
Case startDate
s = startIndex
e = UBound(hours)
Case endDate
s = LBound(hours)
e = endIndex
Case Else
s = LBound(hours)
e = UBound(hours)
End Select
End If
For i = s To e
urls.Add PREFIX & Format$(currentDate, "yyyymmdd") & "-" & hours(i) & SUFFIX
Next
currentDate = DateAdd("d", 1, currentDate)
Loop
Set GetAllLinks = urls
End Function
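Example output: the Output sheet has one row per monitored station per requested datetime, under the headers Date, Time, Station, StationFull and Temp.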
To make my life much easier when working with my data, I keep three columns with the same date content. The first displays in the format mm/dd/yyyy, the second in the format yyyy-mm and the third in the format yyyy-q.
I did this on purpose for my reports, which may be monthly, quarterly, yearly and so on. Usually I work with a form where the user selects a start and end date and runs a report at the click of a button. The report is based on a query whose date criteria pull all records between the start and end dates. This time I want the same procedure, but instead of start and end dates the user should select a quarter, so that the query pulls all records for that quarter. What do I specify in the criteria to achieve this?
Filter on
DatePart("q", [YourDateField])
or
Format([YourDateField], "yyyyq")
To obtain the first and last date of a quarter, given the year and the quarter, you can use these expressions:
DateQuarterFirst = DateSerial(Y, 1 + 3 * (Q - 1), 1)
DateQuarterLast = DateSerial(Y, 1 + 3 * Q, 0)
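As an illustration of how those two expressions could feed a query, the sketch below builds a Between criterion for a given year and quarter. QuarterCriteria and the field name passed to it are hypothetical, and it assumes the field holds date-only values (no time part), otherwise records late on the last day would be missed.

```vba
Public Function QuarterCriteria(ByVal fieldName As String, ByVal Y As Integer, ByVal Q As Integer) As String
    'Hypothetical helper: returns e.g. [OrderDate] Between #2019/04/01# And #2019/06/30#
    Dim firstDate As Date, lastDate As Date
    firstDate = DateSerial(Y, 1 + 3 * (Q - 1), 1)   'first day of the quarter
    lastDate = DateSerial(Y, 1 + 3 * Q, 0)          'day 0 of the next month = last day of the quarter
    'Escaped format keeps the # delimiters and / separators regardless of regional settings
    QuarterCriteria = "[" & fieldName & "] Between " & Format$(firstDate, "\#yyyy\/mm\/dd\#") & _
                      " And " & Format$(lastDate, "\#yyyy\/mm\/dd\#")
End Function
```

The returned string can be used as a WHERE clause or as the filter when opening the report.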
If you have a date within the quarter, you can use these functions to obtain the first and last date of that quarter:
Public Function DateThisQuarterFirst( _
Optional ByVal datDateThisQuarter As Date) As Date
Const cintQuarterMonthCount As Integer = 3
Dim intThisMonth As Integer
If datDateThisQuarter = 0 Then
datDateThisQuarter = Date
End If
intThisMonth = (DatePart("q", datDateThisQuarter) - 1) * cintQuarterMonthCount
DateThisQuarterFirst = DateSerial(Year(datDateThisQuarter), intThisMonth + 1, 1)
End Function
Public Function DateThisQuarterLast( _
Optional ByVal datDateThisQuarter As Date) As Date
Const cintQuarterMonthCount As Integer = 3
Dim intThisMonth As Integer
If datDateThisQuarter = 0 Then
datDateThisQuarter = Date
End If
intThisMonth = DatePart("q", datDateThisQuarter) * cintQuarterMonthCount
DateThisQuarterLast = DateSerial(Year(datDateThisQuarter), intThisMonth + 1, 0)
End Function
I am using the following code in Access 2010. I use it in an unbound text box to return weekdays between two schedule dates (start/Finish) on various scheduled tasks on a form. Code is working properly when dates are entered, however on this particular form not every task will have start / end dates. I would like the code to just return "" or 0 if inputs are blank.
I should note I did not write this code myself, I am very very new to VBA and found this code online and manipulated it slightly to work for my application. How can I modify this to fit my needs?
Public Function Weekdays( ByRef startDate As Date, _
ByRef endDate As Date _
) As Integer
' Returns the number of weekdays in the period from startDate
' to endDate inclusive. Returns -1 if an error occurs.
' If your weekend days do not include Saturday and Sunday and
' do not total two per week in number, this function will
' require modification.
On Error GoTo Weekdays_Error
' The number of weekend days per week.
Const ncNumberOfWeekendDays As Integer = 2
' The number of days inclusive.
Dim varDays As Variant
' The number of weekend days.
Dim varWeekendDays As Variant
' Temporary storage for datetime.
Dim dtmX As Date
' Calculate the number of days inclusive (+ 1 is to add back startDate).
varDays = DateDiff(Interval:="d", _
date1:=startDate, _
date2:=endDate) + 1
' Calculate the number of weekend days.
varWeekendDays = (DateDiff(Interval:="ww", _
date1:=startDate, _
date2:=endDate) _
* ncNumberOfWeekendDays) _
+ IIf(DatePart(Interval:="w", _
Date:=startDate) = vbSunday, 1, 0) _
+ IIf(DatePart(Interval:="w", _
Date:=endDate) = vbSaturday, 1, 0)
' Calculate the number of weekdays.
Weekdays = (varDays - varWeekendDays)
Weekdays_Exit:
Exit Function
Weekdays_Error:
Weekdays = -1
Resume Weekdays_Exit
End Function
Your code will have to accept a Null value. Since Date is a data type that does not tolerate Null, you have two options. The first is to change the declaration of the function from
Public Function Weekdays( ByRef startDate As Date, _
ByRef endDate As Date _
) As Integer
to
Public Function Weekdays(startDate, endDate) As Integer
This way the arguments are Variants and can hold Null, so you can add a check such as:
Public Function Weekdays(startDate, endDate) As Integer
' Returns the number of weekdays in the period from startDate
' to endDate inclusive. Returns -1 if an error occurs.
' If your weekend days do not include Saturday and Sunday and
' do not total two per week in number, this function will
' require modification.
On Error GoTo Weekdays_Error
If IsNull(startDate) Or IsNull(endDate) Then
Weekdays = 0
Exit Function
End If
Const ncNumberOfWeekendDays As Integer = 2
'so on....
The other option is to make sure you always pass dates, by employing Nz(), or to prevent the function from being called at all when you have Null values.
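A sketch of the "do not call it with Null" variant, assuming the original Weekdays function (with Date parameters) is kept unchanged in the same project. WeekdaysOrZero is a hypothetical wrapper name; the text box would call it instead of Weekdays.

```vba
Public Function WeekdaysOrZero(ByVal startDate As Variant, ByVal endDate As Variant) As Integer
    'Hypothetical wrapper: returns 0 when either input is Null or not a date,
    'otherwise passes real Date values on to the existing Weekdays function
    If IsNull(startDate) Or IsNull(endDate) Then Exit Function
    If Not (IsDate(startDate) And IsDate(endDate)) Then Exit Function
    WeekdaysOrZero = Weekdays(CDate(startDate), CDate(endDate))
End Function
```

Because the parameters are Variants, a blank control can be passed straight in without raising an error.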
I'm trying to find some VBA code to determine the number of week days and weekend days in a given date range using Access VBA.
For example:
Begin Date - 1/1/2012
End Date - 1/31/2012
Result should be:
Week days - 22
Weekend days - 9
Can anyone help out with this?
These two functions will calculate the number of weekdays and weekend days:
Function NumWeekendDays(dBegin As Date, dEnd As Date) As Long
Dim iPartial As Integer
Dim lBeginDay As Long
Dim lNumWeekendDays As Long
iPartial = DateDiff("d", dBegin, dEnd + 1) Mod 7
lBeginDay = 6 - DatePart("w", dBegin, vbMonday)
lNumWeekendDays = (DateDiff("d", dBegin, dEnd + 1) \ 7) * 2
If iPartial > 0 And lBeginDay - iPartial < 0 Then
If lBeginDay = -1 Then
lNumWeekendDays = lNumWeekendDays + 1
ElseIf iPartial - lBeginDay = 1 Then
lNumWeekendDays = lNumWeekendDays + 1
Else
lNumWeekendDays = lNumWeekendDays + 2
End If
End If
NumWeekendDays = lNumWeekendDays
End Function
Function NumWeekDays(dBegin As Date, dEnd As Date) As Long
NumWeekDays = DateDiff("d", dBegin, dEnd + 1) - NumWeekendDays(dBegin, dEnd)
End Function
Note: I found it simplest to calculate the partial-week weekend days by calculating the lBeginDay variable so that if the start date was Monday, lBeginDay == 5... if the start date was Friday, lBeginDay == 1, etc. Other variations should also work.
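If you want to sanity-check the arithmetic, a brute-force count is easy to compare against. The function below is only a cross-check sketch with a hypothetical name; it walks every day in the range, so it is slower than the functions above on long ranges.

```vba
Public Function CountWeekendDaysByLoop(ByVal dBegin As Date, ByVal dEnd As Date) As Long
    'Hypothetical cross-check: counts Saturdays and Sundays from dBegin to dEnd inclusive
    Dim d As Date
    d = dBegin
    Do While d <= dEnd
        'With vbMonday as first day of week, Saturday = 6 and Sunday = 7
        If Weekday(d, vbMonday) > 5 Then CountWeekendDaysByLoop = CountWeekendDaysByLoop + 1
        d = DateAdd("d", 1, d)
    Loop
End Function
```

For the range in the question, 1/1/2012 to 1/31/2012, it counts 9 weekend days, leaving 22 week days out of 31.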
I have an Access form, let's call it "Add Labor" (Access 2007), that saves data into a table.
The table has two columns in particular called "Start Date" and "End Date" (This table stores tasks)
There is also another table called FiscalYears which includes Start and End Dates for Fiscal Years, which is structured as follows
FyID
FYear
StartDate
EndDate
Example Data:
FYId FYear StartDate EndDate
-----------------------------
1 2010 10/1/2009 9/30/2010
2 2011 10/1/2010 9/30/2011
So in my Add Labor form, if someone enters labor that spans two fiscal years, I need to create two labor entries. Here is an example:
If a user selects Labor Start Date = 6/30/2009
and End Date = 10/2/2010, it spans two fiscal years.
So in my Labor table I should enter two records:
LaborID StartDate EndDate
-----------------------------
1 6/30/2009 9/30/2010
2 10/1/2010 10/2/2010
Basically I need to do a check before I save the record and add two records if they span Fiscal years, right now I'm just blindly doing Save Record on the form (inbuilt), but I guess I need to add some VBA. I've hardly ever used Access so this may be simple(hopefully). I am thinking instead of the event which just calls Save Record, I need it to add custom VBA.
Say you have an unbound form for adding the dates, you can say:
Dim rsFY As DAO.Recordset
Dim rsAL As DAO.Recordset
Dim db As Database
Dim sSQL As String
Set db = CurrentDb
''Select the fiscal years that overlap the entered date range
sSQL = "SELECT FYear, StartDate, EndDate " _
& "FROM FiscalYears WHERE StartDate<=#" & Format(Me.Enddate, "yyyy/mm/dd") _
& "# And EndDate>=#" & Format(Me.StartDate, "yyyy/mm/dd") _
& "# ORDER BY FYear"
Set rsFY = db.OpenRecordset(sSQL)
Set rsAL = db.OpenRecordset("AddLabor") ''table
''Populate recordset
rsFY.MoveLast
rsFY.MoveFirst
Do While Not rsFY.EOF
''Add records for each year selected
rsAL.AddNew
If rsFY.AbsolutePosition = 0 Then
rsAL!StartDate = Format(Me.StartDate, "yyyy/mm/dd")
Else
rsAL!StartDate = rsFY!StartDate
End If
If rsFY.AbsolutePosition + 1 = rsFY.RecordCount Then
rsAL!Enddate = Format(Me.Enddate, "yyyy/mm/dd")
Else
rsAL!Enddate = rsFY!Enddate
End If
rsAL.Update
rsFY.MoveNext
Loop
If the code was running in a main form with a subform showing the Addlabor table, you could update the subform to show the new records like so:
Me.Addlabor_subform.Requery
Why do you need a FiscalYears table? If your organization's fiscal years always start on Oct. 1 and end on Sept. 30, you can use a function to determine the fiscal year for a given date.
Public Function Fy(ByVal pDate As Date) As Integer
Dim intYear As Integer
Dim intReturn As Integer
intYear = Year(pDate)
If pDate > DateSerial(intYear, 9, 30) Then
intReturn = intYear + 1
Else
intReturn = intYear
End If
Fy = intReturn
End Function
And simple functions to return the Start and End dates for a given year.
Public Function FyStart(ByVal pYear As Integer) As Date
FyStart = DateSerial(pYear - 1, 10, 1)
End Function
Public Function FyEnd(ByVal pYear As Integer) As Date
FyEnd = DateSerial(pYear, 9, 30)
End Function
You can then determine how many fiscal-year boundaries a given date range crosses (add 1 for the number of fiscal years it touches) by:
Fy(EndDate) - Fy(StartDate)
But I may be totally off base because you said "Start Date = 6/30/2009 And End Date 10/2/2010" spans two years. However, this expression returns 2 (3 years):
Fy(#10/2/2010#) - Fy(#6/30/2009#)
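Building on the same 1 Oct to 30 Sep assumption, a range can be split into one segment per fiscal year without the FiscalYears table. This is only a sketch: SplitByFiscalYear is a hypothetical name, the dates are assumed to have no time part, and each segment is returned as a two-element array rather than written to the labor table.

```vba
Public Function SplitByFiscalYear(ByVal startDate As Date, ByVal endDate As Date) As Collection
    'Hypothetical helper: returns a Collection of Array(segmentStart, segmentEnd),
    'one per fiscal year touched, assuming fiscal years end on 30 September
    Dim segments As New Collection
    Dim segStart As Date, segEnd As Date, fyEndDate As Date
    segStart = startDate
    Do While segStart <= endDate
        'Find the first 30 Sep on or after segStart
        fyEndDate = DateSerial(Year(segStart), 9, 30)
        If segStart > fyEndDate Then fyEndDate = DateSerial(Year(segStart) + 1, 9, 30)
        'The segment ends at the fiscal year end or at the overall end date, whichever is earlier
        If fyEndDate < endDate Then segEnd = fyEndDate Else segEnd = endDate
        segments.Add Array(segStart, segEnd)
        segStart = DateAdd("d", 1, segEnd)
    Loop
    Set SplitByFiscalYear = segments
End Function
```

For 6/30/2009 to 10/2/2010 this yields three segments, ending on 9/30/2009, 9/30/2010 and 10/2/2010, which matches the three fiscal years counted above. Each segment could then be saved as its own labor record.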