I was hoping someone could help me figure out why this script will not return the link names. I am trying to return a sub-string from 'http://textfiles.com/directory.html' which just writes the link names to the console, but I am struggling. The main problem - as far as I can see - is in the 'do until' loop. The working code outputs the html text to the console more for my sake than anything else (it does this successfully), but this feature may also help you guys understand the total picture I am facing. Maybe after seeing the code/ understanding my goal you guys can see where I am going wrong AND/OR suggest a better method for achieving my goal. Thanks a ton!
Imports System.IO
Imports System.Text
Module Module1
Sub Main()
Dim line As String = ""
Dim lowBound As String = "<a href="""
Dim highBound As String = """>"
Console.WriteLine("Grab link names from textfiles.com")
Console.WriteLine("")
Dim siteName As String = "http://textfiles.com/directory.html"
Dim tmpString As StringBuilder = New StringBuilder
My.Computer.Network.DownloadFile(siteName, "C:\~\VisualStudio\BeginnerPractice\TextFileDotCom_GrabLinkNames\TextFileDotCom_GrabLinkNames\bin\debug\directory.html", False, 500)
Dim myReader As StreamReader = New StreamReader("C:\~\VisualStudio\BeginnerPractice\TextFileDotCom_GrabLinkNames\TextFileDotCom_GrabLinkNames\bin\debug\directory.html")
While Not IsNothing(line)
line = myReader.ReadLine()
If Not IsNothing(line) Then
tmpString.Append(line)
End If
End While
Dim pageText As String = tmpString.ToString
Console.WriteLine(pageText)
Dim intCounter As Integer = 1
Do Until intCounter >= Len(pageText)
Dim checkSub As String = Mid(pageText, intCounter + 1, (Len(pageText) - intCounter))
Dim positLow As Integer = InStr(checkSub, lowBound)
Dim positHigh As Integer = InStr(checkSub, highBound)
If (positLow > 0 And positHigh > 0) And positLow < positHigh Then
Dim indexLow As Integer = checkSub.IndexOf(lowBound)
Dim indexHigh As Integer = checkSub.IndexOf(highBound)
Dim foundLink As String = checkSub.Substring(indexLow + Len(lowBound), indexHigh - Len(highBound))
Console.WriteLine(foundLink)
intCounter = intCounter + (Len(lowBound) + Len(highBound) + Len(foundLink) - 1)
Else
intCounter = Len(pageText)
End If
Loop
Console.ReadLine()
myReader.Close()
My.Computer.FileSystem.DeleteFile("C:\~\VisualStudio\BeginnerPractice\TextFileDotCom_GrabLinkNames\TextFileDotCom_GrabLinkNames\bin\debug\directory.html")
End Sub
End Module
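Two things in the loop above look like likely culprits. Substring takes a start index and a length, not an end index, and the Else branch ends the loop as soon as the first `">` in the page appears before the first `<a href="`, which can happen when an earlier tag has a quoted attribute. A minimal sketch of a simpler scan, assuming it is enough to read each href value up to its closing quote; the module name, ExtractHrefs, and the sample string are made up for illustration:

```vbnet
Imports System
Imports System.Collections.Generic

Module LinkExtractorSketch
    ' Returns every value found between <a href=" and the next double quote.
    Function ExtractHrefs(pageText As String) As List(Of String)
        Const lowBound As String = "<a href="""
        Dim links As New List(Of String)
        Dim searchFrom As Integer = 0
        Do
            ' Find the next opening marker at or after the current position.
            Dim startIdx As Integer = pageText.IndexOf(lowBound, searchFrom, StringComparison.OrdinalIgnoreCase)
            If startIdx < 0 Then Exit Do
            Dim valueStart As Integer = startIdx + lowBound.Length
            ' The closing quote is searched for only after the opening marker.
            Dim valueEnd As Integer = pageText.IndexOf(""""c, valueStart)
            If valueEnd < 0 Then Exit Do
            ' Second argument of Substring is a length, not an end index.
            links.Add(pageText.Substring(valueStart, valueEnd - valueStart))
            searchFrom = valueEnd + 1
        Loop
        Return links
    End Function

    Sub Main()
        ' Hypothetical sample text, not the real contents of directory.html.
        Dim sample As String = "<a href=""100.html"">100</a> text <a href=""adventure.html"">Adventure</a>"
        For Each link As String In ExtractHrefs(sample)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

With the sample string this writes 100.html and adventure.html. For real pages an HTML parser is usually more robust than string scanning, since attributes can appear in any order.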
I have tried many proven methods from various posts to get some data from a web page without success. I am able to get a list of linked items on the opening page but once I navigate to any other page, I draw a blank with the code below.
When I run the code, I get no results in Cats.
Sub Main()
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim Cats As MSHTML.IHTMLElementCollection
Dim Cat As MSHTML.IHTMLElement
Dim NextHref As String
Dim NextURL As String
XMLReq.Open "GET", URL, False
XMLReq.send
If XMLReq.Status <> 200 Then
MsgBox "Problem"
Exit Sub
End If
HTMLDoc.body.innerHTML = XMLReq.responseText
Set XMLReq = Nothing
Set Cats = HTMLDoc.getElementsByClassName("ng-tns-c329-5 product-grid--tile ng-star-inserted")
Debug.Print Cats.Length 'Returns 0
'For Each Cat In Cats
' NextHref = Cat.getAttribute("href")
' NextURL = URL & Mid(NextHref, InStr(NextHref, ":") + 2)
' ListItemsInCats Cat.innerText, NextURL
'Next Cat
End Sub
Thanks for any assistance.
The problem with the website you are trying to scrape is twofold:
With the XMLHTTP request method, the product details are dynamic content loaded through Fetch/XHR calls, which XMLHTTP does not run. XMLHTTP only gives you the HTML document as served, without any scripts having run.
With the Internet Explorer method, the webpage is considered ready before the product details have actually loaded, so the usual loop that checks Busy and ReadyState is not sufficient.
The code below uses Internet Explorer. To work around the issue above, I have added some checks (not perfect, I believe, but they have worked so far in my testing) that wait until the first product has loaded before pulling the product details:
Private Sub GetBakeryProducts()
Const URL As String = "https://www.woolworths.com.au/shop/browse/bakery"
Dim ieObj As InternetExplorer
Set ieObj = New InternetExplorer
ieObj.navigate URL
ieObj.Visible = True
Do While ieObj.Busy Or ieObj.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Do While ieObj.document.getElementsByClassName("productCarousel-header").Length = 0
DoEvents
Loop
Dim ieDoc As MSHTML.HTMLDocument
Set ieDoc = ieObj.document
Dim productList As Object
Set productList = ieDoc.getElementsByClassName("product-grid--tile")
'==== Test if the website has finish loading the 1st product details
On Error Resume Next
Dim testStatus As String
Do
Err.Clear
testStatus = productList(0).getElementsByClassName("shelfProductTile-descriptionLink")(0).innerText
Loop Until Err.Number = 0
'====
Dim outputArr() As String
ReDim outputArr(1 To productList.Length, 1 To 2) As String
Dim outputIndex As Long
Dim i As Long
For i = 0 To productList.Length - 1
If productList(i).getElementsByClassName("shelfProductTile-descriptionLink").Length <> 0 Then
If Err.Number <> 0 Then
Err.Clear
Exit For
End If
Dim productName As String
Dim productPrice As String
productName = productList(i).getElementsByClassName("shelfProductTile-descriptionLink")(0).innerText
productPrice = Replace(productList(i).getElementsByClassName("price")(0).innerText, vbNewLine, vbNullString)
outputIndex = outputIndex + 1
outputArr(outputIndex, 1) = productName
outputArr(outputIndex, 2) = productPrice
End If
Next i
ReDim Preserve outputArr(1 To outputIndex, 1 To 2) As String
ieObj.Quit
Set ieObj = Nothing
ThisWorkbook.Sheets("Sheet1").Range("A1").Resize(outputIndex, UBound(outputArr, 2)).Value = outputArr
End Sub
Running this will pull the data from the website and paste the output starting from cell A1 in Sheet1; please change the worksheet name and range as you see fit.
I'm building a web form in VB.NET that consumes a web service which returns all the countries.
It has only one button, Enviar, which calls the service.
Imports service_country = WebServiceVB2.country
Protected Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim serv_country As New service_country.country '--Create object'
Dim MyDoc As New System.Xml.XmlDocument
Dim MyXml As String = serv_country.GetCountries() '--Execute procedure from webservice'
MyDoc.LoadXml(MyXml) '--Read Myxml and convert to XML'
Dim SymbolText As String = MyDoc.SelectSingleNode("//NewDataSet/Table/Name").InnerText '--select the node'
Label1.Text = SymbolText
End Sub
My question is: how can I select all the values inside the 'Name' nodes?
At the moment it only shows one.
For Example:
Thanks in advance.
This was an interesting problem. Since the data comes back as a web page, the opening bracket arrives as "&lt;" and the closing bracket as "&gt;", so these had to be replaced. I used LINQ to XML to get the names:
Imports System.Xml
Imports System.Xml.Linq
Module Module1
Const URL As String = "http://www.webservicex.net/country.asmx/GetCountries"
Sub Main()
Dim doc1 As XDocument = XDocument.Load(URL)
Dim docStr As String = doc1.ToString()
' Turn the escaped entities back into real angle brackets
docStr = docStr.Replace("&gt;", ">")
docStr = docStr.Replace("&lt;", "<")
Dim doc2 As XDocument = XDocument.Parse(docStr)
Dim root As XElement = doc2.Root
Dim defaultNs As XNamespace = root.GetDefaultNamespace()
Dim names() As String = doc2.Descendants(defaultNs + "Name").Select(Function(x) CType(x, String)).ToArray()
End Sub
End Module
Using WebUtility
Imports System.Xml
Imports System.Xml.Linq
Imports System.Text
Imports System.Net
Module Module1
Const URL As String = "http://www.webservicex.net/country.asmx/GetCountries"
Sub Main()
Dim xReader As XmlReader = XmlTextReader.Create(URL)
xReader.MoveToContent()
Dim doc As XDocument = XDocument.Parse(WebUtility.HtmlDecode("<?xml version=""1.0"" encoding=""iso-8859-9"" ?>" & xReader.ReadOuterXml))
Dim root As XElement = doc.Root
Dim defaultNs As XNamespace = root.GetDefaultNamespace()
Dim names() As String = doc.Descendants(defaultNs + "Name").Select(Function(x) CType(x, String)).ToArray()
End Sub
End Module
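If the XML is already loaded into an XmlDocument as in the question, SelectNodes returns every match where SelectSingleNode returns only the first. A minimal sketch, assuming the decoded XML has the NewDataSet/Table/Name shape implied by the question's XPath; the sample string and its country values are placeholders, not output from the service:

```vbnet
Imports System
Imports System.Xml

Module SelectAllNamesSketch
    Sub Main()
        ' Hypothetical sample standing in for the string returned by GetCountries().
        Dim myXml As String =
            "<NewDataSet>" &
            "<Table><Name>Argentina</Name></Table>" &
            "<Table><Name>Brazil</Name></Table>" &
            "</NewDataSet>"

        Dim doc As New XmlDocument()
        doc.LoadXml(myXml)

        ' SelectNodes returns all matching nodes, not just the first one.
        For Each node As XmlNode In doc.SelectNodes("//NewDataSet/Table/Name")
            Console.WriteLine(node.InnerText)
        Next
    End Sub
End Module
```

In the web form, the loop body could append each InnerText to a list control instead of writing to the console.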
Is there a way of taking html code for a table and printing out the same table in a word document using VBA (VBA should be able to parse the html code block for a table)?
It is possible to take the contents of the table and copy them into a new table created in Word, however is it possible to recreate a table using the html code and vba?
For any of this, where can one begin to research?
EDIT:
Thanks to R3uK: here is the first portion of the VBA script, which reads a line of HTML code from a file and uses R3uK's code to print it to the Excel worksheet:
Private Sub button1_Click()
Dim the_string As String
the_string = Trim(ImportTextFile("path\to\file.txt"))
' still working on removing new line characters
Call PrintHTML_Table(the_string)
End Sub
Public Function ImportTextFile(strFile As String) As String
' http://mrspreadsheets.com/1/post/2013/09/vba-code-snippet-22-read-entire-text-file-into-string-variable.html
Open strFile For Input As #1
ImportTextFile = Input$(LOF(1), 1)
Close #1
End Function
' Insert R3uK's portion of the code here
This could be a good place to start. You will only need to check the content afterwards to see if there is any problem, and then copy it to Word.
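Sub PrintHTML_Table(ByVal StrTable As String)
' Must be a String array to receive the String() returned by SplitTo2DArray
Dim TA() As String
Dim Table_String As String
Dim i As Long, j As Long
Table_String = " " & StrTable & " "
TA = SplitTo2DArray(Table_String, "</tr>", "</td>")
' SplitTo2DArray fills the array as (column, row), so i is the column and j is the row
For i = LBound(TA, 1) To UBound(TA, 1)
For j = LBound(TA, 2) To UBound(TA, 2)
ActiveSheet.Cells(j + 1, i + 1) = Trim(Replace(Replace(TA(i, j), "<td>", ""), "<tr>", ""))
Next j
Next i
End Sub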
Public Function SplitTo2DArray(ByRef StringToSplit As String, ByRef RowSep As String, ByRef ColSep As String) As String()
Dim Rows As Variant
Dim rowNb As Long
Dim Columns() As Variant
Dim i As Long
Dim maxlineNb As Long
Dim lineNb As Long
Dim asCells() As String
Dim j As Long
' Split up the table value by rows, get the number of rows, and dim a new array of Variants.
Rows = Split(StringToSplit, RowSep)
rowNb = UBound(Rows)
ReDim Columns(0 To rowNb)
' Iterate through each row, and split it into columns. Find the maximum number of columns.
maxlineNb = 0
For i = 0 To rowNb
Columns(i) = Split(Rows(i), ColSep)
lineNb = UBound(Columns(i))
If lineNb > maxlineNb Then
maxlineNb = lineNb
End If
Next i
' Create a 2D string array to contain the data in <Columns>.
ReDim asCells(0 To maxlineNb, 0 To rowNb)
' Copy all the data from Columns() to asCells().
For i = 0 To rowNb
For j = 0 To UBound(Columns(i))
asCells(j, i) = Columns(i)(j)
Next j
Next i
SplitTo2DArray = asCells()
End Function
I have checked Google, and the suggested answers here, but have had no luck unfortunately.
The last thing I need to do is have an email read the rateNbr variable into the email body, but it just comes up empty.
I tried to make Public Function FuncRateCheckFile read as Public Function FuncRateCheckFile(ByVal rateNbr As String), to try and enable it to be called outside the function, but this then breaks the function when it is called elsewhere. :(
Here is the code, with comments as to where I am referring:
Public Function FuncRateCheckFile()
Dim blnContinue As Boolean
Dim strLine As String
Dim strSearchFor, strSearchWrd, LineCount, objFSO, objTextFile, arrLines
Dim dteNow As Date
Dim newDate As String
'//==============================================================================================
'// DECLARED
Dim rateNbr As String
'//==============================================================================================
FuncRateCheckFile = False
blnContinue = True
If blnContinue Then
Const ForReading = 1
'Get todays date and reformat it
dteNow = DateValue(Now)
newDate = Format(dteNow, "dd/MM/yy")
strSearchWrd = newDate
'Read the whole file
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile(m_RateCheckFile, ForReading)
LineCount = 0
Do Until objTextFile.AtEndOfStream
strLine = objTextFile.ReadLine()
If InStr(strLine, strSearchWrd) <> 0 Then
arrLines = Split(strLine, vbCrLf)
LineCount = LineCount + 1
End If
Loop
'Log a message to state how many lines have todays day, and if there are none, log an error
If LineCount <> 0 Then
'//==============================================================================================
'// "rateNbr" IS WHAT I AM TRYING TO GET TO PUT IN THE EMAIL
LogMessage "Rate file date is correct"
rateNbr = "Number of rates for " & newDate & " in the file recieved on " & newDate & " is " & LineCount
LogMessage rateNbr
EmailAdvice2
objTextFile.Close
'//==============================================================================================
Else
blnContinue = False
LogError "Failed to retrieve Current Rate date, please check rate file.."
EmailAdvice
objTextFile.Close
End If
End If
FuncRateCheckFile = blnContinue
LogMessage "Completed Check Rate file"
End Function
Private Function EmailAdvice2()
Dim strSMTPFrom As String
Dim strSMTPTo As String
Dim strSMTPRelay As String
Dim strTextBody As String
Dim strSubject As String
Dim oMessage As Object
'//==============================================================================================
'// DECLARED AGAIN
Dim rateNbr As String
'//==============================================================================================
Set oMessage = CreateObject("CDO.Message")
strSMTPFrom = "no-reply#work.com.au"
strSMTPTo = "me#work.com.au"
strSMTPRelay = "smtp.relay.com"
'//==============================================================================================
'// THIS MAKES THE TEXT BODY BLANK, BUT THE EMAIL STILL SENDS
strTextBody = rateNbr
'//==============================================================================================
strSubject = "Todays rates"
'strAttachment = "full UNC path of file"
oMessage.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2
oMessage.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserver") = strSMTPRelay
oMessage.Configuration.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25
oMessage.Configuration.Fields.Update
oMessage.Subject = strSubject
oMessage.From = strSMTPFrom
oMessage.To = strSMTPTo
oMessage.textbody = strTextBody
'oMessage.AddAttachment strAttachment
oMessage.Send
End Function
I am positive that it is blank because I have declared rateNbr under EmailAdvice2() and then not given it anything to fill the variable with. But I don't know how to make it call the variable under FuncRateCheckFile().
Thanks to all for any assistance.
As Plutonix stated, this is a scope issue.
Move the declaration of your 'rateNbr' variable out to class level, and remove the local declarations inside your functions:
Dim rateNbr As String ' <-- out at class level it will be accessible from both functions
Public Function FuncRateCheckFile()
...
' REMOVE the local declaration of "rateNbr" from INSIDE this function
...
End Function
Private Function EmailAdvice2()
...
' REMOVE the local declaration of "rateNbr" from INSIDE this function
...
End Function
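An alternative to a shared variable is to hand the text to the email routine as an argument, so the parameter goes on EmailAdvice2 rather than on FuncRateCheckFile. A rough sketch in VB.NET syntax, with BuildEmailBody as a hypothetical stand-in for EmailAdvice2 and made-up values for the date and line count:

```vbnet
Imports System

Module RateMessageSketch
    ' Hypothetical stand-in for EmailAdvice2: the text arrives as a parameter,
    ' so no second declaration of rateNbr is needed inside the routine.
    Function BuildEmailBody(rateNbr As String) As String
        Dim strTextBody As String = rateNbr
        Return strTextBody
    End Function

    Sub Main()
        ' Made-up values standing in for what FuncRateCheckFile computes.
        Dim lineCount As Integer = 3
        Dim newDate As String = "01/02/24"
        Dim rateNbr As String = "Number of rates for " & newDate & " is " & lineCount.ToString()

        ' The caller passes its local value straight to the email routine.
        Console.WriteLine(BuildEmailBody(rateNbr))
    End Sub
End Module
```

In the original code this would mean declaring EmailAdvice2 with a rateNbr parameter and calling it with the value built in FuncRateCheckFile.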
I have a simple form created in Visual Studio (VB) which has a data gridview connected to a table in MySQL (hosted in a remote server).
I have the below code to export the grid view to Excel but it takes a really long time to export (around 15 minutes).
The table in MySQL is really small (1000 rows and 60 columns).
Is there a better way to export the complete MySQL table to excel?
PLEASE HELP
CODE:
Dim xlApp As Microsoft.Office.Interop.Excel.Application
Dim xlWorkBook As Microsoft.Office.Interop.Excel.Workbook
Dim xlWorkSheet As Microsoft.Office.Interop.Excel.Worksheet
Dim misValue As Object = System.Reflection.Missing.Value
Dim i As Integer
Dim j As Integer
xlApp = New Microsoft.Office.Interop.Excel.Application
xlWorkBook = xlApp.Workbooks.Add(misValue)
xlWorkSheet = xlWorkBook.Sheets("sheet1")
For i = 0 To DataGridView1.Rows.Count - 1
For j = 0 To DataGridView1.Columns.Count - 1
For k As Integer = 1 To DataGridView1.Columns.Count
On Error Resume Next
xlWorkSheet.Cells(1, k) = DataGridView1.Columns(k - 1).HeaderText
xlWorkSheet.Cells(i + 2, j + 1) = DataGridView1(j, i).Value.ToString()
Next
Next
Next
xlWorkSheet.SaveAs("C:\Users\USERNAME\Desktop\vbexcel.xlsx")
xlWorkBook.Close()
There are two ways to do this:
using RDLC reports.
using GridView exporting.
For the first way, you need to create an RDLC report, retrieve the data into a DataTable, and then call this code:
Dim MyDataSource As ReportDataSource = New ReportDataSource("ReportDataSet", MyDataTable)
rvSmartCardsIssues.LocalReport.ReportPath = Server.MapPath("MyrdlcReportPath")
rvSmartCardsIssues.LocalReport.EnableExternalImages = True
rvSmartCardsIssues.LocalReport.DataSources.Clear()
rvSmartCardsIssues.LocalReport.DataSources.Add(MyDataSource )
rvSmartCardsIssues.LocalReport.Refresh()
Dim warnings As Warning() = Nothing
Dim streamids As String() = Nothing
Dim mimeType As String = Nothing
Dim encoding As String = Nothing
Dim extension As String = Nothing
Dim bytes As Byte()
bytes = rvSmartCardsIssues.LocalReport.Render("Excel", Nothing, mimeType, encoding, extension, streamids, warnings)
HttpContext.Current.Response.Buffer = True
HttpContext.Current.Response.Clear()
HttpContext.Current.Response.ContentType = mimeType
HttpContext.Current.Response.AddHeader("content-disposition", "attachment; filename=ExportedFileName.xls")
HttpContext.Current.Response.BinaryWrite(bytes)
HttpContext.Current.Response.Flush()
HttpContext.Current.Response.End()
The second way is based on this function:
Public Shared Sub ExportGridViewToExcelGridView(ByVal Filename As String, ByRef gvr As GridView, ByRef currentPage As Page)
Dim HtmlForm As System.Web.UI.HtmlControls.HtmlForm = New System.Web.UI.HtmlControls.HtmlForm()
currentPage.Controls.Add(HtmlForm)
HtmlForm.Controls.Add(gvr)
currentPage.Response.Clear()
currentPage.Response.Buffer = True
currentPage.Response.AddHeader("Content-Disposition", "attachment; filename=" & Filename)
currentPage.Response.ContentType = "application/vnd.ms-excel"
currentPage.Response.ContentEncoding = System.Text.Encoding.UTF8
currentPage.Response.Charset = ""
currentPage.EnableViewState = False
Using strwriter As New StringWriter
Dim htmlwrt As HtmlTextWriter = New HtmlTextWriter(strwriter)
HtmlForm.RenderControl(htmlwrt)
htmlwrt.Flush()
currentPage.Response.Write(strwriter.ToString)
currentPage.Response.End()
End Using
End Sub
When you use the second way, make sure paging is disabled on the GridView.
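For the WinForms case in the question, note that the innermost k loop rewrites the header and the same data cell once per column, so every cell is written many times over, and each write is a separate call into Excel. A common way to cut this down is to collect everything into one 2D array first and assign it to a range in a single call. A minimal sketch of the array-building half, assuming the data is available as a DataTable; ToGrid and the sample table are made up for illustration, and the range assignment is left as a comment because it needs Excel installed:

```vbnet
Imports System
Imports System.Data

Module BulkExportSketch
    ' Copies the column names plus all rows of a DataTable into one 2D array.
    Function ToGrid(table As DataTable) As Object(,)
        ' VB array bounds are upper bounds: Rows.Count gives one extra row for headers.
        Dim grid(table.Rows.Count, table.Columns.Count - 1) As Object
        For c As Integer = 0 To table.Columns.Count - 1
            grid(0, c) = table.Columns(c).ColumnName
        Next
        For r As Integer = 0 To table.Rows.Count - 1
            For c As Integer = 0 To table.Columns.Count - 1
                grid(r + 1, c) = table.Rows(r)(c)
            Next
        Next
        Return grid
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the MySQL table.
        Dim table As New DataTable()
        table.Columns.Add("Id", GetType(Integer))
        table.Columns.Add("Name", GetType(String))
        table.Rows.Add(1, "alpha")
        table.Rows.Add(2, "beta")

        Dim grid As Object(,) = ToGrid(table)
        Console.WriteLine("{0} x {1}", grid.GetLength(0), grid.GetLength(1))

        ' Assumption: with Excel interop, the whole array can then be written at once, e.g.
        ' xlWorkSheet.Range("A1").Resize(grid.GetLength(0), grid.GetLength(1)).Value = grid
    End Sub
End Module
```

The sample prints 3 x 2 (one header row plus two data rows, two columns). Database nulls would arrive as DBNull values and may need converting before the array is written to a sheet.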