VB.Net Regex xml whitebox ssrs pdf - html

I have an SSRS (2008 R2) report which outputs to PDF. The report takes a string (originally in HTML format) and uses a custom VB function to remove HTML, whitespace and XML characters using regular expressions. The issue is that a white box character is still left in the resulting string. It looks like the following symbol:
□
The VB function I have is as follows:
Public Shared Function GetNotes(ByVal strNotes As String) As SqlString
    ' Gets notes within HTML tags
    Dim s As String
    Try
        s = System.Text.RegularExpressions.Regex.Replace(strNotes, "<.*?\n?.*?>", " ")
        s = System.Text.RegularExpressions.Regex.Replace(s, " +", " ")
        s = System.Text.RegularExpressions.Regex.Replace(s, "<[^<>]*?>", " ")
        s = System.Text.RegularExpressions.Regex.Replace(s, "[\t\r\n] ", "")
        ' Decode the entities that survive tag removal
        s = s.Replace("&amp;", "&")
        s = s.Replace("&nbsp;", "")
        s = s.Trim()
    Catch ex As Exception
        Return New SqlString("Description: ")
    End Try
    Return New SqlString(s)
End Function
What should I add to remove this whitebox?

As per your comment, the character appears only at the end of the string.
You may easily use TrimEnd for this purpose:
Dim s As String = "Some text with □"
s = s.TrimEnd("□"c)
Alternatively, this should also work (assuming the box really is the \u25A1 character):
s = Regex.Replace(s, "[\u25A1]", String.Empty)
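If the box still shows up, it may not be a literal U+25A1 at all: a PDF renderer can draw a box for any character the font cannot display, such as a control character. The sketch below is one way to check which code points are actually in the string and to strip them. It assumes a plain console program; the helper names DescribeOddChars and StripOddChars are made up for illustration and are not part of the original function.

```vb
Option Strict On
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module WhiteBoxDemo
    ' Hypothetical helper: lists the code point of every character
    ' outside the printable ASCII range, so the culprit can be identified.
    Function DescribeOddChars(ByVal s As String) As String
        Dim sb As New StringBuilder()
        For Each c As Char In s
            If AscW(c) < 32 OrElse AscW(c) > 126 Then
                sb.AppendFormat("U+{0:X4} ", AscW(c))
            End If
        Next
        Return sb.ToString().Trim()
    End Function

    ' Hypothetical helper: removes the literal white square plus all
    ' control (Cc) and format (Cf) characters. Note that Cc includes
    ' tab, CR and LF.
    Function StripOddChars(ByVal s As String) As String
        Return Regex.Replace(s, "[\u25A1\p{Cc}\p{Cf}]", String.Empty)
    End Function

    Sub Main()
        ' Sample input: text followed by a control character and a white square
        Dim s As String = "Some text " & ChrW(&H1F) & ChrW(&H25A1)
        Console.WriteLine(DescribeOddChars(s))               ' U+001F U+25A1
        Console.WriteLine("[" & StripOddChars(s).Trim() & "]") ' [Some text]
    End Sub
End Module
```

Once the offending code point is known, the character class in StripOddChars can be narrowed to just that character.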

Display values of all Parameters dynamically in textbox

We have a number of historic SSRS reports and would like to add an optional Glossary as the last page of output showing the values of all the parameters (including multiselects).
Is there any generic way of doing this in code or an assembly, or do I have to hand-crank expressions that are specific to each report? For example, I know that this kind of expression will work, but it will be laborious if I have to do it per report:
="Param1:"+CStr(Parameters!p1.Value) + vbCrLF
+ "Param2:" +CStr(Parameters!p2.Value) + vbCrLF
+ "Param3:" +CStr(Parameters!p3.Value)
Thanks,
MrHH
You can do this by passing the parameter to a custom function. I built a simple sample with a list of countries as the parameter.
Add the following function to the report's code section.
Public Function ListSelectedParaValues(ByVal p As Parameter) As String
    ' Returns every selected value of a (possibly multi-value) parameter, one per line
    Dim pList As String = ""
    Dim i As Integer
    If p.IsMultiValue Then
        For i = 0 To p.Count - 1
            pList = pList & "Value " & CStr(i) & ": " & CStr(p.Value(i)) & " " & vbCrLf
        Next
    Else
        pList = CStr(p.Value)
    End If
    Return pList
End Function
To test this, add a textbox and set its expression to
=Code.ListSelectedParaValues(Parameters!Countries)
Obviously you need to swap out Countries for the name of your parameter, but do NOT append .Value as you normally would. It will fail if you do; you just need the parameter name.

display number of rows found from data table in listview

I am trying to display the number of rows in a listview. I tried this code but instead of working, it throws the error below. I am using mysql for a back end:
error:
System.InvalidCastException: Conversion from string " '" to type
'Double' is not valid. ---> System.FormatException: Input string was
not in a correct format
VB Code:
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    Try
        ViewState("Data") = ""
        Using con As New MySqlConnection(constr)
            Using cmd As New MySqlCommand("SELECT * FROM school")
                Using sda As New MySqlDataAdapter()
                    cmd.Connection = con
                    sda.SelectCommand = cmd
                    cmd.CommandTimeout = 0
                    Using dt As New DataTable()
                        sda.Fill(dt)
                        ViewState("Data") = dt
                        schoollists.DataSource = dt
                        schoollists.DataBind()
                    End Using
                End Using
            End Using
        End Using
        countResult.Text = (" '" + schoollists.Items.Count + "';")
    Catch ex As Exception
        Response.Write(ex)
    End Try
End Sub
In VB.NET, using the + operator to concatenate a string and a number yields unexpected results (or, as you can see, an exception).
Using + between a string and a number makes VB.NET try to convert the string (" '" in your case) to a number, which of course fails.
To be on the safe side, the correct operator for concatenating strings in VB.NET is the & operator:
countResult.Text = (" '" & schoollists.Items.Count & "';")
As a side note, this happens because Option Strict is set to Off in your project settings. With this configuration the compiler cannot catch the problem, and you do not see the error until you hit it at runtime. I strongly suggest setting Option Strict to On, even if you initially have a lot of code to fix.
A detailed explanation of how the plus operator behaves with numbers and strings can be found in the Remarks section of the + Operator page on MSDN.
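To make the difference concrete, here is a minimal console sketch with Option Strict switched on. The variable count is only a stand-in for schoollists.Items.Count, and the module name is made up for illustration.

```vb
Option Strict On
Imports System

Module ConcatDemo
    Sub Main()
        ' Stand-in for schoollists.Items.Count
        Dim count As Integer = 3

        ' The & operator always converts its operands to String
        Dim a As String = " '" & count & "';"

        ' An explicit conversion makes the intent obvious
        Dim b As String = " '" & count.ToString() & "';"

        ' String.Format avoids concatenation altogether
        Dim c As String = String.Format(" '{0}';", count)

        ' With Option Strict On, the following line does not compile,
        ' so the mistake is caught before the page ever runs:
        ' Dim d As String = " '" + count + "';"

        Console.WriteLine(a) ' prints:  '3';
        Console.WriteLine(b) ' prints:  '3';
        Console.WriteLine(c) ' prints:  '3';
    End Sub
End Module
```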

Is there a better way to extract HTML code using Visual Basic

I'm trying to extract some text from HTML code here. I only want the final String to say
'Entity B'. Is there a better way to do this than what I have done here?
This is also the format for many entries, so Entity B won't always be Entity B, and the same goes for Entity C.
SMethod = "<b>Entity B<br/>Entity C</b>"
SMethod = SMethod.Replace("</b>", "</c>")
SMethod = SMethod.Replace("<br/>", "</b><c>")
SMethod = "<a>" & SMethod & "</a>"
Dim ShippingMethod As XDocument = XDocument.Parse(SMethod)
SMethod = ShippingMethod.Element("a").Element("b").Value.Trim
I'm not 100% clear what your end game is, but as for the first sentence of your question where you want to take a string with HTML code in it and remove all the code, this function will remove any tag enclosed in <>:
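Public Function RemoveHTML(ByVal input As String) As String
    ' Strips every <...> tag; stops at a "<" that has no closing ">"
    Dim tagStart As Integer = InStr(input, "<")
    While tagStart > 0
        ' Search for ">" only after the "<" that was found
        Dim tagEnd As Integer = InStr(tagStart, input, ">")
        If tagEnd = 0 Then Exit While
        input = Left(input, tagStart - 1) & Right(input, Len(input) - tagEnd)
        tagStart = InStr(input, "<")
    End While
    Return input
End Function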
And if you're also trying to trim off anything after the <br/> tag, this will do that:
Public Function OneEntity(ByVal input As String) As String
    If InStr(input, "<br/>") > 0 Then
        Dim parts() As String = Split(input, "<br/>")
        Return RemoveHTML(parts(0))
    Else
        Return RemoveHTML(input)
    End If
End Function
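The same two steps (keep what comes before the line break, then drop the tags) can also be written with regular expressions. This is only a sketch, assuming the input always looks like the sample in the question; FirstEntity is a hypothetical name.

```vb
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module FirstEntityDemo
    ' Hypothetical helper: returns the text before the first line-break tag,
    ' with any remaining tags removed.
    Function FirstEntity(ByVal html As String) As String
        ' Split on <br>, <br/> or <br />, ignoring case
        Dim parts() As String = Regex.Split(html, "<br\s*/?>", RegexOptions.IgnoreCase)
        ' Strip whatever tags are left in the first part
        Return Regex.Replace(parts(0), "<[^>]*>", String.Empty).Trim()
    End Function

    Sub Main()
        Console.WriteLine(FirstEntity("<b>Entity B<br/>Entity C</b>")) ' prints: Entity B
    End Sub
End Module
```

For HTML that is more irregular than this sample, a real HTML parser would be a safer choice than pattern matching.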

How to Remove Stop Words from a string using Visual Basic?

I'm looking for a way to remove stop words using a function in Visual Basic inside my Access DB.
Right now I'm just doing several Replace calls, but I know that's not the right way, as I can't tell whether I'm removing the stop word as a whole word or from inside another word.
Any help would be great; I just cannot find a way to do this in VB.
Okay, you mean something like this, right?
OutputString = Replace("They answered the question", "the", "")
This replaces all occurrences of "the" from the phrase, including part of the word "They".
The simplest solution would be to put spaces before and after the word to replace:
OutputString = Replace("They answered the question", " the ", "")
This works for the phrase in my above example, but it won't work when the word occurs at the beginning or at the end of the phrase.
For these cases, you need to do more. Something like this:
Public Function RemoveStopWords( _
    ByVal Phrase As String, _
    ByVal WordToRemove As String _
    ) As String
    Dim RetVal As String
    Dim Tmp As String
    'remove the word in the middle of the phrase
    RetVal = Replace(Phrase, " " & WordToRemove & " ", " ")
    'remove the word at the beginning
    Tmp = WordToRemove & " "
    If Left(RetVal, Len(Tmp)) = Tmp Then
        RetVal = Mid(RetVal, Len(Tmp) + 1)
    End If
    'remove the word at the end
    Tmp = " " & WordToRemove
    If Right(RetVal, Len(Tmp)) = Tmp Then
        RetVal = Left(RetVal, Len(RetVal) - Len(Tmp))
    End If
    RemoveStopWords = RetVal
End Function
This works as long as the words in the phrase are always separated with blanks.
When there can be other separators than blanks, you have to do even more.
For example, instead of hardcoding the blanks in the function, you could loop over a list of separators and execute the function for each one.
I won't show this as code now, but you get the idea.
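As one possible alternative to looping over separators by hand, a regular expression with word boundaries (\b) treats any non-word character as a separator. The following is a minimal VBA sketch, assuming the VBScript.RegExp COM object is available on the machine. The function name RemoveStopWordsRx and the word list are hypothetical, and stop words that contain regex metacharacters would need escaping first.

```vb
' VBA (Access). Late binding is used, so no library reference is required.
' Hypothetical helper: removes every whole-word occurrence of each stop word.
Public Function RemoveStopWordsRx(ByVal Phrase As String, ByVal StopWords As Variant) As String
    Dim re As Object
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' replace all occurrences, not just the first
    re.IgnoreCase = True    ' "The" and "the" are treated alike

    For i = LBound(StopWords) To UBound(StopWords)
        ' \b matches at a word boundary, so "the" does not match inside "They"
        re.Pattern = "\b" & StopWords(i) & "\b"
        Phrase = re.Replace(Phrase, "")
    Next i

    ' Collapse the runs of whitespace left behind by the removals
    re.Pattern = "\s{2,}"
    RemoveStopWordsRx = Trim$(re.Replace(Phrase, " "))
End Function
```

Called from the Immediate window as ? RemoveStopWordsRx("They answered the question", Array("the", "a")), this should return "They answered question".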

Better way to implement an Access 2007 "HTML Report"

I need to make a "static html" FAQ-like-document for internal use on a project.
I put all the items in an Access 2007 database as records (question, answer, category) and then built a report that uses a sub-report to create a table of contents as internal links and then lists all of the questions and answers. This report is a bunch of text areas with dynamically generated HTML code (apparently I don't have enough cred to post images yet, so http://i115.photobucket.com/albums/n299/SinbadEV/ReportCapture.png). I just export the report to a text file, rename it to .html and open it in a browser.
I'm thinking there has to be a less evil way to do this.
I have now used an idea from SinbadEV and awrigley to create professional-looking HTML reports in MS Access 2007. In my case I had to use yet another trick:
I found that, due to some bug, MS Access does not always save the report correctly to txt format. Sometimes it drops a lot of information, even though it is displayed on the screen. I have also seen the problem, mentioned here, that Access sometimes mixes up lines. It seems to depend on several factors, e.g. whether the report and its data span several pages in the MS Access report.
However, I found that exporting to *.rtf does work correctly. The approach is therefore to craft an MS Access report which, when saved to a text file, would produce HTML code (just as described by SinbadEV), but to save it to *.rtf first. After that, you use MS Word automation to convert the *.rtf file to a txt file and give it an .html extension (in reality this does not take much effort).
Instead of MS Word automation, one can probably also use a tool like Doxillion Document Converter to convert from RTF to text format from the command line.
You can see a database with this feature in the Meeting minutes, Issues, Risks, Agreements, Actions, Projects Tracking tool (http://sourceforge.net/projects/miraapt/).
There's an ExportXML method in the Application object, which can export database objects (tables, reports etc.) as XML. You'll need an XSL style sheet or an XSLT document if you want to format it for a browser:
http://msdn.microsoft.com/en-us/library/bb258194(v=office.12).aspx
I'd say this is the "canonical" way to do it. OTOH, writing XSL and XSLT isn't exactly a fun thing to do, and if your HTML generator works, then you should simply keep it as it is. (Actually, it's a nice trick IMHO.)
I don't see anything inherently "evil" in what you are doing. I wrote an article for the (now defunct) magazine Smart Access that uses a similar technique for a different reason. The HTML report was a by-product. Essentially, my technique allows using Access to create very extensive Word documents that flow like typed text rather than looking like reports created using boxes.
You can still read the article on MSDN:
Extending Access Reports With Word and HTML
The trick was to generate HTML using a report as you are doing, then, using automation, open the .html file in Word and save it as RTF.
We used the technique to create a 300 page directory for the Diocese of York. It worked flawlessly.
Just in case you want to go the VBA way, I wrote a few functions that make it quite easy:
create queries containing the data you want to output,
then open each query and loop through all records, writing the data to a text file using the function fRsToXml below.
Option Compare Database
Option Explicit

Function fRsToXml(rs As Recordset, Optional ignorePrefix As String = "zz", _
                  Optional ignoreNulls As Boolean = False) As String
'<description> Returns an XML string with all fields of the current record,
'              using field names as tags.
'              Field names starting with "zz" (or other special prefix) are ignored</description>
'<parameters> rs: recordset (byRef, of course)</parameters>
'<author> Patrick Honorez - www.idevlop.com </author>
    Dim f As Field, bPrefLen As Byte
    Dim strResult As String
    bPrefLen = Len(ignorePrefix)
    For Each f In rs.Fields
        If Left(f.Name, bPrefLen) <> ignorePrefix Then 'zz fields are ignored !
            If (Not ignoreNulls) Or (ignoreNulls And Not IsNull(f.Value)) Then
                strResult = strResult & xTag(f.Name, f.Value) & vbCrLf
            End If
        End If
    Next f
    fRsToXml = strResult
End Function

Function xTag(ByVal sTagName As String, ByVal sValue, Optional SplitLines As Boolean = False) As String
'<description> Creates an XML node and returns it as a string </description>
'<parameters> <sTagName> name of the tag </sTagName>
'             <sValue> string to embed </sValue>
'             <SplitLine> True to include CrLf at the end of each line
'                         (optional - default = False) </SplitLine></parameters>
'<author> Patrick Honorez - www.idevlop.com </author>
'<note> Make sure sValue does not contain XML forbidden characters ! </note>
'<changelog>
'</changelog>
    Dim strNl As String, intAmp
    If SplitLines Then
        strNl = vbCrLf
    Else
        strNl = vbNullString
    End If
    xTag = "<" & sTagName & ">" & strNl & _
           Nz(sValue, "") & strNl & _
           "</" & sTagName & ">" '& strNl
End Function

Function CleanupStr(strXmlValue) As String
'<description> Replaces forbidden chars &'"<> with their Predefined General Entities </description>
'<author> Patrick Honorez - www.idevlop.com </author>
    Dim sValue As String
    If IsNull(strXmlValue) Then
        CleanupStr = ""
    Else
        sValue = CStr(strXmlValue)
        sValue = Replace(sValue, "&", "&amp;") 'do ampersand first !
        sValue = Replace(sValue, "'", "&apos;")
        sValue = Replace(sValue, """", "&quot;")
        sValue = Replace(sValue, "<", "&lt;")
        sValue = Replace(sValue, ">", "&gt;")
        CleanupStr = sValue
    End If
End Function
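The looping step described in the two bullet points is not shown in the answer. A minimal sketch of what it could look like follows, assuming DAO and that fRsToXml lives in the same project. The procedure name ExportQueryToXml and the records/record wrapper tags are invented for illustration.

```vb
' VBA (Access, DAO). Hypothetical driver for the fRsToXml function.
Public Sub ExportQueryToXml(ByVal QueryName As String, ByVal FilePath As String)
    Dim rs As DAO.Recordset
    Dim fileNo As Integer

    Set rs = CurrentDb.OpenRecordset(QueryName, dbOpenSnapshot)

    fileNo = FreeFile
    Open FilePath For Output As #fileNo

    ' One wrapper element per record, inside a single root element
    Print #fileNo, "<records>"
    Do Until rs.EOF
        Print #fileNo, "<record>"
        Print #fileNo, fRsToXml(rs);   ' the string already ends with vbCrLf
        Print #fileNo, "</record>"
        rs.MoveNext
    Loop
    Print #fileNo, "</records>"

    Close #fileNo
    rs.Close
    Set rs = Nothing
End Sub
```

Field values are written as they are, so the caveat about forbidden XML characters in the xTag notes still applies.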
I used to spoof the report generator into making HTML documents for me, but this approach has limitations. Firstly, when you run the report, it generates rather ugly HTML and not a print-ready report. More work is needed after running the report to transform it into a nice HTML document that can be opened in a word processor and saved as a regular document. LibreOffice is often a better recipient of generated HTML documents than MS Word, but occasionally LibreOffice fails to do the job (for a while it had issues with linked images). Word processors ignore CSS styles, so don't bother with styles; direct formatting still works well, particularly for text in tables. If all the exported data is inside an HTML table, use LibreOffice, as it can generate a table of contents based on h1, h2, h3 headings, whereas MS Word cannot.
These days, I just write the entire report as a procedure in a VBA standard module. I still do not use object-oriented code, and there is no reason to here. Reports written entirely in VBA can be far more sophisticated than what the standard MS Access report designer can produce. Report designer reports take a lot of tinkering to get the format just right, and this consumes time. For complex reports, the VBA approach is actually faster. A report written in VBA can be run every other second, so it is easy to adjust something such as the column width of a table and rerun the report to check the output. An HTML report created with VBA is written out as an HTML file, and MS Access can issue a shell command to open the report in a web browser. If the browser is already open, the new report opens in a new tab, so you can still see what the previous version looked like in the other tab.
Write the report in a standard module (not in a form module) and call it from a button-click event on the form. The report should only need to be told what the title is, what the output filename and location are, and the data scope that the report should output. The report procedure contains all other logic necessary for creating the report. Below is the calling procedure for triggering a report in one of my applications. The purpose of the calling code is to export a list of geotagged photos to a delimited text file so that I can plot the photo locations on a map. The process for exporting an HTML file is very similar. Some custom functions are used in the code below, but the structure should be recognisable.
Private Sub cmdCSV_File_Click()
    Dim FolderName As String
    Dim FileName As String
    Dim ReportTitle As String
    Dim SQL As String
    Dim FixedFields As String
    Dim WhereClause As String
    Dim SortOrder As String

    'Set destination of exported data
    FolderName = InputBox("Please enter name of folder to export to", AppName, mDefaultFolder)
    If mPaths.FolderExists(FolderName).Success Then
        mDefaultFolder = FolderName 'holds default folder name in case it is needed again
    Else
        MsgBox "Can't find this folder", vbCritical, AppName
        Exit Sub
    End If
    FileName = CheckTrailingSlash(FolderName) & "PhotoPoints.txt"

    'Set Report Title
    If Nz(Me.chkAllProjects, 0) Then
        ReportTitle = "Photos from all Projects"
    ElseIf Nz(Me.SampleID, 0) Then
        ReportTitle = "Photos from Sample " & Me.SampleID
    ElseIf Nz(Me.SurveyID, 0) Then
        ReportTitle = "Photos from Survey " & Me.SurveyID
    ElseIf Nz(Me.ProjectID, 0) Then
        ReportTitle = "Photos from Project " & Me.ProjectID
    Else
        MsgBox "Please select a scope before pressing this button", vbExclamation, AppName
        Exit Sub
    End If

    'Update paths to photos
    If Have(Me.ProjectID) Then
        WhereClause = " (PhotoPath_ProjectID = " & Me.ProjectID & ")" 'also covers sample and survey level selections
    Else
        WhereClause = " True" 'when all records is selected
    End If
    Call mPhotos.UpdatePhotoPaths(WhereClause) 'refreshes current paths

    'Set fixed parts of SQL statement
    FixedFields = "SELECT Photos.*, PhotoPaths.PhotoPath_Alias, PhotoPaths.CurrentPath & Photos.PhotoName AS URL, " _
        & "PhotoPaths.CurrentPath & 'Thumbs\' & Photos.PhotoName as Thumb " _
        & "FROM Photos INNER JOIN PhotoPaths ON Photos.PhotoPathID = PhotoPaths.PhotoPathID WHERE "
    SortOrder = " ORDER BY ProjectID, SurveyID, SampleID, Photo_ID"

    'set scope for export
    WhereClause = "(((Photos.Latitude) Between -90 And 90) AND ((Photos.Longitude) Between -180 And 180) AND ((Photos.Latitude)<>0) AND ((Photos.Longitude)<>0)) AND " & WhereClause
    SQL = FixedFields & WhereClause & SortOrder & ";"

    'Export data as a delimited list
    FileName = ExportCSV(FileName, SQL)
    Call OpenBrowser(FileName)
End Sub
The next bit of code actually writes out the delimited text file (html just has tags instead of pipes). The vertical bar or pipe is used to separate the values rather than a comma in this case as commas may occur in the data. The code works out how many columns there are for itself and puts headings at the top.
Public Function ExportCSV(FileAddress As Variant, SQL As String) As String
    If Not gDeveloping Then On Error GoTo procerr
    PushStack ("mfiles.ExportCSV")
    'Exports a pipe-delimited text file
    If Nz(FileAddress, "") = "" Then
        ExportCSV = "Failed"
        Exit Function
    End If

    'Create text file:
    Dim webfile As Object, w
    Set webfile = CreateObject("Scripting.FileSystemObject")
    Set w = webfile.CreateTextFile(FileAddress, True)

    Dim D As Database, R As Recordset, NumberOfFields As Long, Out As String, i As Long
    Set D = CurrentDb()
    Set R = D.OpenRecordset(SQL, dbOpenSnapshot)
    If R.RecordCount > 0 Then
        With R
            NumberOfFields = .Fields.Count - 1
            'Field headings
            For i = 0 To NumberOfFields
                If i = 0 Then
                    Out = .Fields(i).Name
                Else
                    Out = Out & "|" & .Fields(i).Name
                End If
            Next
            w.writeline Out
            'Field data
            Do Until .EOF
                For i = 0 To NumberOfFields
                    If i = 0 Then
                        Out = .Fields(i)
                    Else
                        Out = Out & "|" & .Fields(i)
                    End If
                Next i
                w.writeline Out
                .MoveNext
            Loop
        End With
    End If

    'Close the file and recordset before handing the file to the browser
    w.Close
    R.Close
    Set R = Nothing
    Set D = Nothing
    ExportCSV = FileAddress
exitproc:
    PopStack
    Exit Function
procerr:
    Call NewErrorLog(Err.Number, Err.Description, gCurrentProc, FileAddress & ", " & SQL)
    Resume exitproc
End Function
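For the HTML case mentioned above, where tags take the place of pipes, the same loop can emit table markup instead. This is a minimal sketch, assuming DAO and the same kind of SELECT statement. RecordsetToHtmlTable and HtmlEscape are hypothetical names and are not part of the author's application.

```vb
' VBA (Access, DAO). Hypothetical helpers for illustration only.

' Converts a field value to text and escapes the characters HTML reserves.
Private Function HtmlEscape(ByVal v As Variant) As String
    Dim s As String
    s = Nz(v, "") & ""
    s = Replace(s, "&", "&amp;")   ' ampersand first
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    HtmlEscape = s
End Function

' Returns the query result as an HTML table with a heading row.
Public Function RecordsetToHtmlTable(ByVal SQL As String) As String
    Dim R As DAO.Recordset
    Dim i As Long
    Dim Html As String

    Set R = CurrentDb().OpenRecordset(SQL, dbOpenSnapshot)

    'Field headings
    Html = "<table>" & vbCrLf & "<tr>"
    For i = 0 To R.Fields.Count - 1
        Html = Html & "<th>" & HtmlEscape(R.Fields(i).Name) & "</th>"
    Next i
    Html = Html & "</tr>" & vbCrLf

    'Field data, one table row per record
    Do Until R.EOF
        Html = Html & "<tr>"
        For i = 0 To R.Fields.Count - 1
            Html = Html & "<td>" & HtmlEscape(R.Fields(i).Value) & "</td>"
        Next i
        Html = Html & "</tr>" & vbCrLf
        R.MoveNext
    Loop

    R.Close
    Set R = Nothing
    RecordsetToHtmlTable = Html & "</table>"
End Function
```

The returned string still needs to be wrapped in html and body tags and written to a file, for example with the same FileSystemObject calls that ExportCSV uses.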
Below is a snippet from the OpenBrowser function. The rest of the function deals with figuring out where the web browser is, as this varies with the version of Windows and whether the browser is 32 or 64 bit.
'Set up preferred browser
If Right(BrowserPath, 9) = "Opera.exe" Then
    FilePrefix = "file://localhost/"
ElseIf Right(BrowserPath, 11) = "Firefox.exe" Then
    FilePrefix = "file:///"
Else
    FilePrefix = ""
End If
'Show report
Instruction = BrowserPath & " " & FilePrefix & WebpageName
TaskSuccessID = Shell(Instruction, vbMaximizedFocus)
This example contains about 90% of the code needed to create an HTML report whose scope is set by the form that calls it. Hope this gets someone over the hump.