A website has changed so I can't scrape data from it anymore. Just need to change the set line below I believe but have tried a number of things and none have worked. I'm not very knowledgeable about this area I'm afraid but rest of code is working. Eg url is https://www.gurufocus.com/stock/CFWFF/insider and it is the table of insider transactions I am trying to press the dropdown for and change it to 100 instead of 10:
t = Timer
noTable = 0
Do
Set DropDown = doc.querySelectorAll(".el-dropdown-menu__item")
lastDropDrownItemIndex = DropDown.Length - 1
If Timer - t > MAX_WAIT_SEC Then
noTable = 1
Exit Do
End If
Loop While lastDropDrownItemIndex < 1
If noTable = 1 Then GoTo noTableEscape
DropDown.Item(lastDropDrownItemIndex).Click
Thanks
Ok so, not sure what you are after exactly, but the website you're scraping offers an API which in my opinion could probably make your life a lot easier. To put it simply, this means that it provides an easy way to request for data with the use of some parameters embedded in a URL. It returns the requested data in JSON format.
In the following code I will be using the XMLHTTP request method and a JSON Parser. For this you will need the following:
JSON parser , it helps you parse the downloaded data. Follow the installation instructions to import it in your project
A reference to the Microsoft Scripting Runtime library. The JSON parser needs it.
A reference to the Microsoft WinHTTP Services, Version 5.1 library. It lets you use an HTTP request object.
For demonstration purposes, the following code will only print in the immediate window the first entry's name and position. You can modify the code to fit your needs.
Sub test()
Dim req As New WinHttpRequest
Dim jsonResponse As String
Dim jsonParsed As Object
Dim url As String
Dim pageNum As Integer
Dim numPerPage As Integer
pageNum = 1 'You can change this parameter to navigate in different pages
numPerPage = 100 'You can change this parameter to control the number of entries
url = "https://www.gurufocus.com/reader/_api/stocks/OTCPK:CFWFF/insider?page=" & pageNum & "&per_page=" & numPerPage & "&sort=date%7Cdesc"
With req
.Open "GET", url, False
.setRequestHeader "Accept", "application/json, text/plain, */*"
.setRequestHeader "Authorization", ThisWorkbook.Worksheets("The name of your Worksheet").Range("A1").Value 'I have stored a string that is essential to the request in cell A1
.send
jsonResponse = .responseText
End With
Set jsonParsed = JsonConverter.ParseJson(jsonResponse)
Debug.Print jsonParsed("data")(1)("name") 'get the name parameter of the first entry
Debug.Print jsonParsed("data")(1)("position") 'get the position parameter of the first entry
End Sub
Please note that there's a very long string which is essential to the request, which I have stored in cell A1. This string looks like so:
Bearer
eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImp0aSI6ImUxYjAwMmYxMjczMGRiMTBmMmZkYjJkNDk0YTU4NjRmZDZjOWY3ZGI4ZmI1NDY1NTQ2MzZlMGJhNzkxODUxNmY4NTM2ZWIzZDNhODhmN2VmIn0.eyJhdWQiOiIyIiwianRpIjoiZTFiMDAyZjEyNzMwZGIxMGYyZmRiMmQ0OTRhNTg2NGZkNmM5ZjdkYjhmYjU0NjU1NDYzNmUwYmE3OTE4NTE2Zjg1MzZlYjNkM2E4OGY3ZWYiLCJpYXQiOjE1NTkwNzA3OTcsIm5iZiI6MTU1OTA3MDc5NywiZXhwIjoxNTY5NDM4NzkzLCJzdWIiOiIiLCJzY29wZXMiOltdfQ.mZ4DqhUk9YAU6JYDBScF8MJ_zHPyL94bAec7LxZTaWipcWf9uesdGDMDC9v_7W-6zrtXAUWhk4YAL70E5rpPjM7gusYH0RfO48O2PnaV8gsqXoNCFwFBOHuxh109q7X0YsNkfX2wX8m3XigtK9A_YAGID7wxgX96lwzBevsDJ3borHMcJlQtxidF_Bq2D5WPASsuy3jdY80HkOCR1y4eaSIswBEtK5rPj_xy7VXRbYGhLklqw4wgHgq4blfaHnVVmPXf6k8mx45ye8vPecS-w9kjuDOHVn2mvU6mpBzqEpbH4lqpiqmYG7M-CvB1joEAcMQtcilCvsdfKOusoC2MU4_vPtF3Q4ZFVaEcXIQgomdKtFa_XGpCudit45b2rEFacKMUENqLj_sPwYkgM1IPl1lQfR-VpigqnCHPAxVQAPzqwJvS6CxuYOPmvnrx23fBAillP7LtDHwHtlMpgZUjdB5y6IWsia76crM4kbkrKn3zc8xoAGb1fIrgJlY-9hOzrwsmrchantEdYOFZjcMJvhCnlfvnEm6kT2Sdcu4o6TndTZJjrVmD4mb-jNGy4kw_mAx1DfyqR7GLtCVSzcSLKgrrwCJEL22K2bfXH2HExXvgLFbPXivVZJc70TnF9lJmx_dx79cxAm7szFGIdrs56bAC4mdKpvKL3BNmVY-J-G0
The same string should work for you as well.
The result looks like so:
Brown, James Michael
Senior Officer
Each one of the 100 data entries has the following structure:
It's fairly easy to loop through all the entries as well. For example, to print the name of all the entries you would have to do this:
Dim item As Object
For Each item In jsonParsed("data")
Debug.Print item("name")
Next item
Finally, you can also loop through all the parameters of each entry. For example, the following code prints all the parameters and their corresponding values for the first entry:
Dim key As Variant
For Each key In jsonParsed("data")(1).Keys
Debug.Print key & ": " & jsonParsed("data")(1)(key)
Next key
So this way you can basically access any parameter you want for each entry.
I'm trying to modify a link "URL link" using VBA Excel to extract specific Value from Site .
Below the Type of link :
URLhttp://Confidential.eu.airbus.Confidential:Confidential/Confidential/consultation/preViewMP.do?mpId=XXXXXX
What I want is to change mpID=XXXXX with sheet("Feuil1").range("A1").valuebut I didn't succeed , I don't have the right knowledge
So I don't have a clue how to manipulate this URL to open what I enter in range("A1") and look for specific line there and Copy and Past it in my excel File
Anyone could light me with some idea or help to better code this ?
You can get the string from A1 using
Dim str As String: str = ThisWorkbook.Sheets("Feuil1").Range("A1").Value
Then create the URL from what you've stated
Dim myURL as String
' The & symbol concatenates strings. The _ symbol is for line continuation.
myURL = "http://Confidential.eu.airbus.Confidential:Confidential/Confidential/" _
& "consultation/preViewMP.do?" & str
We are in the process of implementing code to read/create 2D bar codes that are starting to show up on our supplier's parts.
We have a need to create the 2D bar codes in MS Access reports and forms. Has anyone had success with the font (IDAutomation) or Active X (dlSoft) solutions out there.
For C#, the open source library "http://barcoderender.codeplex.com/" was suggested. Any thoughts on how successful this was or if anyone has other open-source and/or pay for options.
Thanks,
Anton
Depending on the volume of codes you need to generate, you could use the Google Charts API to generate QR Codes.
Simply add a "Microsoft Web Browser" ActiveX component and the following code to your Form:
Dim Size As Integer
Dim Text As String
Dim URL As String
Size = 200
Text = "This is my test"
' Better to actually use a URL encoding function like those described here:
' http://stackoverflow.com/questions/218181/how-can-i-url-encode-a-string-in-excel-vba
Text = Replace(Text, " ", "%20")
URL = "http://chart.apis.google.com/chart?chs=" & Size & "x" & Size & "&cht=qr&chld=H|0&chl=" & Text
WebBrowser.Navigate (URL)
You can of course change the Size and the Text depending on your need. The Text can also be a value directly from your Form, therefore your data.
I would advise you to check Googles Terms and Services before using it.
I completely escaped the web browser control and the Google API since that functionality is now deprecated from what I can tell. I went with a different free API but the Google API or any other API could be used instead.
In my example I am creating an .png image in the same directory as the application. I have a text box on my form named txtToCode in which I type in any text I want to code. I also have an image control so that the image can be viewed from the form, but you can modify it how you wish:
Private Sub btnCode2_Click()
Call GetQRCode(Me.txtToCode, 150, 150)
End Sub
Sub GetQRCode(Content As String, Width As Integer, Height As Integer)
Dim ByteData() As Byte
Dim XmlHttp As Object
Dim HttpReq As String
Dim ReturnContent As String
Dim EncContent As String
Dim QRImage As String
EncContent = EncodeURL(Content)
HttpReq = "https://api.qrserver.com/v1/create-qr-code/?data=" & EncContent & "&size=" & Width & "x" & Height & ""
Set XmlHttp = CreateObject("MSXML2.XmlHttp")
XmlHttp.Open "GET", HttpReq, False
XmlHttp.Send
ByteData = XmlHttp.responseBody
Set XmlHttp = Nothing
ReturnContent = StrConv(ByteData, vbUnicode)
Call ExportImage(ReturnContent)
End Sub
Private Sub ExportImage(image As String)
Dim FilePath As String
On Error GoTo NoSave
' Build Export Path
FilePath = Application.CurrentProject.Path & "\qr.png"
Open FilePath For Binary As #1
Put #1, 1, image
Close #1
Me.Image3.Picture = FilePath
' Save File Path
Exit Sub
NoSave:
MsgBox "Could not save the QR Code Image! Reason: " & Err.Description, vbCritical, "File Save Error"
End Sub
Private Function EncodeURL(str As String)
Dim ScriptEngine As Object
Dim encoded As String
Dim Temp As String
Temp = Replace(str, " ", "%20")
Temp = Replace(Temp, "#", "%23")
EncodeURL = Temp
End Function
I want to create a classic asp (vbscript) function that replaces all 'returns' that occur between double quotes.
The input string is 'csv' like:
ID;Text;Number
1;some text;20
2;"some text with unwanted return
";30
3;some text again;40
I want to split the string on chr(13) (returns) to create single rows in an array. It works well, except for the unwanted chr(13) that is contained in the text of id 2.
I hope someone could help.
Fundamentally, this is going to be difficult to do as you won't be able to tell whether the carriage return is a valid one or not. Clearly the ones after 20 and 30 are valid.
An approach I would would be to scan through each line in the file and count the commas that occur. If it's less than 3, then append the next line and use the concatenated string. (This of course assumes your CSV structure is consistent and fixed).
What I would really be asking here is why is the CSV like this in the first place? The routine that populates this should really be the one stripping the the CRs out.
Think of a CSV file like a very crude database or spreadsheet. When cosidering the above file, it is clear that the 'Database'/'Spreadsheet' is corrupt.
If the program that generates this is correupting it, then what extent should the reading application goto to correct these defects? I'm not sure that Excel or SQL Server (for example) would go to great lengths to correct a corrupt data source.
Your text file is just like a CSV file but with semicolons not commas. Use ADO to grab the data and it will handle the line breaks in fields.
Specifically (In ASP VBScript):
On Error Resume Next
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001
Set objConnection = Server.CreateObject("ADODB.Connection")
Set objRecordSet = Server.CreateObject("ADODB.Recordset")
strPathtoTextFile = server.mappath(".") 'Path to your text file
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathtoTextFile & ";" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited"""
objRecordset.Open "SELECT * FROM test.txt", _
objConnection, adOpenStatic, adLockOptimistic, adCmdText
Do Until objRecordset.EOF
Response.Write "ID: " & objRecordset.Fields.Item("ID") & "<br>"
Response.Write "Text: " & objRecordset.Fields.Item("Text") & "<br>"
Response.Write "Number: " & objRecordset.Fields.Item("Number") & "<br>"
objRecordset.MoveNext
Loop
Code sample is modified from Microsofts' Much ADO About Text Files.
This script assumes your data text file is in the same directory as it (the asp file). It also needs a schema.ini file in the same directory as your data text file with the data:
[test.txt]
Format=Delimited(;)
Change text.txt in both code samples above to the name of your text file.
If the unwanted CRLF always occurs inside a text field (inside double quotes), it would not be very difficult to use a regular expression to remove these. Vbscript has a regex engine to its disposal: http://authors.aspalliance.com/brettb/VBScriptRegularExpressions.asp
It all depends ofcourse on how familiar you are with Regular Expressions. I couldn't think of the proper syntax off the top of my head, but this is probably quite easy to figure out.
The solution is pretty easy:
str = "Some text..." & chr(13)
str = REPLACE(str,VbCrlf,"")
The secret is use VbCrlf. For me I use a simple function for solve the problem and add this in my framework.
FUNCTION performStringTreatmentRemoveNewLineChar(byval str)
IF isNull(str) THEN
str = ""
END IF
str = REPLACE(str,VbCrlf,"")
performStringTreatmentRemoveNewLineChar = TRIM(str)
END FUNCTION
Of course this will remove all new lines character from this string. Use carrefully.
I need to make a "static html" FAQ-like-document for internal use on a project.
I put all the items in an Access 2007 Database as records (question, answer, category) and then built a report that uses a sub-report to create a table of contents as internal links and then lists all of the questions and answers. This report is a bunch of text-areas with dynamically generated html code(apparently I don't have enough cred to post images yet so http://i115.photobucket.com/albums/n299/SinbadEV/ReportCapture.png)... I just export the report to a text file and then rename it to .html and open it in a browser.
I'm thinking there has to be a less evil way to do this.
I have now used an idea from SinbadEV and awrigley to create professionally looking HTML-reports in MS Access 2007. In my case I had to use yet another trick:
I found out, that due to some bug in MS Access it does not save the report correctly to txt format. Sometimes it drops a lot of information, even though it is displayed on the screen. I have also seen problem, mentioned here that sometimes access mixing lines. It seem to depend on several factors, e.g. whether report and a data span across pages in MS Acess report.
However I found, that exporting to *.rtf does work correctly. Therefore the approach is to craft MS Acess report, which, when saved into text file would create an HTML code (just like described by SinbadEV ), however you 1st need to save it to *rtf. After that you need to use MS Word automation to convert from *.rtf to txt file and to give it .html extention (In reality it does not take too much efforts).
Instead of MS word automation one can probably also use tool like Doxillion Document Converter to convert from rtf to text format from command line.
You can see database with this feature in the Meeting minutes, Issues, Risks, Agreements, Actions, Projects Tracking tool (http://sourceforge.net/projects/miraapt/).
There's an ExportXML method in the Application object, which can export database objects (tables,reports etc.) in XML. You'll need a XSL style sheet or a XSTL document if you want to format it for a browser:
http://msdn.microsoft.com/en-us/library/bb258194(v=office.12).aspx
I'd say this is the "canonical" way to do it. OTOH writing XSL & XSTL isn't like a fun thing to do and if you HTML generator works, then you should simply keep it like it is. (Actually, it's a nice trick IMHO).
I don't see anything inherently "evil" in what you are doing. I wrote an article for (the now defunct magazine) Smart Access that uses a similar technique for a different reason. The HTML report was a by product. Essentially, my technique allows using Access to create very extensive word documents that flow like typed text rather than looking like reports created using boxes.
You can still read the article on MSDN:
Extending Access Reports With Word and HTML
The trick was to generate HTML using a report like you are doing, then using automation, open the .html file in Word and save it as RTF.
We used the technique to create a 300 page directory for the Diocese of York. It worked flawlessly.
Just in case you want to go the VBA way: I wrote a few functions that can make it quite easy:
create queries containing the data you want to output,
then open the query and loop through all records, outputting data to text file using function rRsToXml below.
Option Compare Database
Option Explicit
Function fRsToXml(rs As Recordset, Optional ignorePrefix As String = "zz", _
Optional ignoreNulls As Boolean = False) As String
'<description> Returns an XML string with all fields of the current record,
' using field names as tags.
' Field names starting with "zz" (or other special prefix) are ignored</description>
'<parameters> rs: recordset (byRef, of course)</parameters>
'<author> Patrick Honorez - www.idevlop.com </author>
Dim f As Field, bPrefLen As Byte
Dim strResult As String
bPrefLen = Len(ignorePrefix)
For Each f In rs.Fields
If Left(f.Name, bPrefLen) <> ignorePrefix Then 'zz fields are ignored !
If (Not ignoreNulls) Or (ignoreNulls And Not IsNull(f.Value)) Then
strResult = strResult & xTag(f.Name, f.Value) & vbCrLf
End If
End If
Next f
fRsToXml = strResult
End Function
Function xTag(ByVal sTagName As String, ByVal sValue, Optional SplitLines As Boolean = False) As String
'<description> Create an xml node and returns it as a string </description>
'<parameters> <sTagName> name of the tag </sTagName>
' <sValue> string to embed </sValue>
' <SplitLine> True to include CrLf at the end of each line
' (optional - default = False) </SplitLine></parameters>
'<author> Patrick Honorez - www.idevlop.com </author>
'<note> Make sure sValue does not contains XML forbidden characters ! </note>
'<changelog>
'</changelog>
Dim strNl As String, intAmp
If SplitLines Then
strNl = vbCrLf
Else
strNl = vbNullString
End If
xTag = "<" & sTagName & ">" & strNl & _
Nz(sValue, "") & strNl & _
"</" & sTagName & ">" '& strNl
End Function
Function CleanupStr(strXmlValue) As String
'<description> Replace forbidden char. &'"<> by their Predefined General Entities </description>
'<author> Patrick Honorez - www.idevlop.com </author>
Dim sValue As String
If IsNull(strXmlValue) Then
CleanupStr = ""
Else
sValue = CStr(strXmlValue)
sValue = Replace(sValue, "&", "&") 'do ampersand first !
sValue = Replace(sValue, "'", "'")
sValue = Replace(sValue, """", """)
sValue = Replace(sValue, "<", "<")
sValue = Replace(sValue, ">", ">")
CleanupStr = sValue
End If
End Function
I used to spoof the report generator into making html documents for me but this approach has limitations. Firstly when you run the report, it generates rather ugly html and not a print ready report. There is more work after running the report to transform the report into a nice html document that can be opened in a word processor and then saved as a regular document. LibreOffice often is a better recipient of generated html documents than ms-word but occasionally LibreOffice fails to do the job (for a while it had issues with linked images). Word processors ignore css styles so don't bother with styles, direct formatting still works well, particularly for text is tables. If all the exported data is inside a html table, then use LibreOffice as LibreOffice can generate a table of contents based on h1, h2, h3 headings, whereas ms-word cannot.
These days, I just write the entire report as a procedure in a VBA standard module. I still do not use object oriented code and there is no reason to here. Reports written entirely in VBA can be far more sophisticated that what the standard ms-Access report designer can produce. Report designer reports take a lot of tinkering to get the format just right and this consumes time. For complex reports, the VBA approach is actually faster. A report written in VBA can be run every other second, so it is easy to adjust something such as the column width of a table and to rerun the report to check the output. A html report created with VBA is written out as a html file and the ms-access can issue a shell command to open the report in a web browser. If the browser is already open, the new report opens in a new tab so you can see what the previous version looked like as this version will still be open in another tab.
Write the report in a standard module (not in a form module) and call it from some button-click event on the form. The report should only need to be told what the title is, what the output filename and location are and the data scope that the report should output. The report procedure contains all other logic necessary for creating the report. Below is the calling procedure for triggering a report in one of my applications. The purpose of the calling code is to export a list of geotagged photos in a delimited text file so that I can plot the photo locations on a map. The process for exporting a html file is very similar. Some custom functions are in the code below but the structure should be recognisable.
Private Sub cmdCSV_File_Click()
Dim FolderName As String
Dim FileName As String
Dim ReportTitle As String
Dim SQL As String
Dim FixedFields As String
Dim WhereClause As String
Dim SortOrder As String
'Set destination of exported data
FolderName = InputBox("Please enter name of folder to export to", AppName, mDefaultFolder)
If mPaths.FolderExists(FolderName).Success Then
mDefaultFolder = FolderName 'holds default folder name in case it is needed again
Else
MsgBox "Can't find this folder", vbCritical, AppName
Exit Sub
End If
FileName = CheckTrailingSlash(FolderName) & "PhotoPoints.txt"
'Set Report Title
If Nz(Me.chkAllProjects, 0) Then
ReportTitle = "Photos from all Projects"
ElseIf Nz(Me.SampleID, 0) Then
ReportTitle = "Photos from Sample " & Me.SampleID
ElseIf Nz(Me.SurveyID, 0) Then
ReportTitle = "Photos from Survey " & Me.SurveyID
ElseIf Nz(Me.ProjectID, 0) Then
ReportTitle = "Photos from Project " & Me.ProjectID
Else
MsgBox "Please select a scope before pressing this button", vbExclamation, AppName
Exit Sub
End If
'Update paths to photos
If Have(Me.ProjectID) Then
WhereClause = " (PhotoPath_ProjectID = " & Me.ProjectID & ")" 'also covers sample and survey level selections
Else
WhereClause = " True" 'when all records is selected
End If
Call mPhotos.UpdatePhotoPaths(WhereClause) 'refreshes current paths
'Set fixed parts of SQL statement
FixedFields = "SELECT Photos.*, PhotoPaths.PhotoPath_Alias, PhotoPaths.CurrentPath & Photos.PhotoName AS URL, " _
& "PhotoPaths.CurrentPath & 'Thumbs\' & Photos.PhotoName as Thumb " _
& "FROM Photos INNER JOIN PhotoPaths ON Photos.PhotoPathID = PhotoPaths.PhotoPathID WHERE "
SortOrder = " ORDER BY ProjectID, SurveyID, SampleID, Photo_ID"
'set scope for export
WhereClause = "(((Photos.Latitude) Between -90 And 90) AND ((Photos.Longitude) Between -180 And 180) AND ((Photos.Latitude)<>0) AND ((Photos.Longitude)<>0)) AND " & WhereClause
SQL = FixedFields & WhereClause & SortOrder & ";"
'Export data as a delimited list
FileName = ExportCSV(FileName, SQL)
Call OpenBrowser(FileName)
End Sub
The next bit of code actually writes out the delimited text file (html just has tags instead of pipes). The vertical bar or pipe is used to separate the values rather than a comma in this case as commas may occur in the data. The code works out how many columns there are for itself and puts headings at the top.
Public Function ExportCSV(FileAddress As Variant, SQL As String) As String
If Not gDeveloping Then On Error GoTo procerr
PushStack ("mfiles.ExportCSV")
'Exports a csv file
If Nz(FileAddress, "") = "" Then
ExportCSV = "Failed"
Exit Function
End If
'Create text file:
Dim webfile As Object, w
Set webfile = CreateObject("Scripting.FileSystemObject")
Set w = webfile.CreateTextFile(FileAddress, True)
Dim D As Database, R As Recordset, NumberOfFields As Long, Out As String, i As Long
Set D = CurrentDb()
Set R = D.OpenRecordset(SQL, dbOpenSnapshot)
If R.RecordCount > 0 Then
With R
NumberOfFields = .Fields.Count - 1
'Field headings
For i = 0 To NumberOfFields
If i = 0 Then
Out = .Fields(i).Name
Else
Out = Out & "|" & .Fields(i).Name
End If
Next
w.writeline Out
'Field data
Do Until .EOF
For i = 0 To NumberOfFields
If i = 0 Then
Out = .Fields(i)
Else
Out = Out & "|" & .Fields(i)
End If
Next i
w.writeline Out
.MoveNext
Loop
End With
End If
Set R = Nothing
Set D = Nothing
ExportCSV = FileAddress
exitproc:
PopStack
Exit Function
procerr:
Call NewErrorLog(Err.Number, Err.Description, gCurrentProc, FileAddress & ", " & SQL)
Resume exitproc
End Function
Below is a snippet from the openbrowser function. The rest of the function deals with figuring out where the web browser is, as this varies with the version of windows and whether the browser is 32 or 64 bit.
'Set up preferred browser
If Right(BrowserPath, 9) = "Opera.exe" Then
FilePrefix = "file://localhost/"
ElseIf Right(BrowserPath, 11) = "Firefox.exe" Then
FilePrefix = "file:///"
Else
FilePrefix = ""
End If
'Show report
Instruction = BrowserPath & " " & FilePrefix & WebpageName
TaskSuccessID = Shell(Instruction, vbMaximizedFocus)
This example contains about 90% of the code needed to create a html report that has its scope set by the form that calls it. Hope this gets someone over the hump.