Get First Vid From Youtube VB.NET - html

I'm trying to get the first YouTube link from YouTube or Google, but I can't get it to work. Can someone please help me out?
Dim m As New Regex("<a href=""/watch?v=.*""")
Dim request2 As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("https://www.youtube.com/results?search_query=" + ListBox1.SelectedItem + " " + ListBox2.SelectedItem)
Dim responseyoutube As System.Net.HttpWebResponse = request2.GetResponse
TextBox2.Text = (request2.Address.ToString)
Dim sr As System.IO.StreamReader = New System.IO.StreamReader(responseyoutube.GetResponseStream())
Dim rssourcecodey As String = sr.ReadToEnd
Dim matches As MatchCollection = m.Matches(rssourcecodey)
TextBox1.Text = rssourcecodey
For Each itemcode2 As Match In matches
youtube = itemcode2.Value.Split("=").GetValue(1)
ListBox2.Items.Add(youtube)
Next

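? is a regex metacharacter that makes the preceding token optional (or, when it directly follows * or +, makes that quantifier lazy). To match a literal ? you need to escape it:
Dim m As New Regex("<a href=""/watch[?]v=.*""")
OR
Dim m As New Regex("<a href=""/watch\?v=.*""")
VB.NET string literals do not treat the backslash as an escape character, so a single \ is enough. Writing \\? would instead match an optional literal backslash.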
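As a minimal sketch of the escaped pattern in use, the console program below pulls out only the first video id. The html string is a hypothetical stand-in for the downloaded page source, and the real results page may not contain anchors of this form. It also swaps the greedy .* for [^""]* (an addition, not part of the answer above) so that the match stops at the closing quote, and uses a capture group instead of Split.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstWatchLinkDemo
    Sub Main()
        ' Hypothetical stand-in for the downloaded page source.
        Dim html As String =
            "<a href=""/watch?v=abc123"" class=""x"">One</a>" &
            "<a href=""/watch?v=def456"">Two</a>"

        ' \? matches a literal question mark. [^""]* stops at the closing quote,
        ' and the parentheses capture just the video id.
        Dim pattern As New Regex("<a href=""/watch\?v=([^""]*)""")

        ' Match (rather than Matches) returns only the first hit.
        Dim first As Match = pattern.Match(html)
        If first.Success Then
            Console.WriteLine(first.Groups(1).Value)   ' prints abc123
        Else
            Console.WriteLine("No watch link found")
        End If
    End Sub
End Module
```

Using Regex.Match here avoids looping over a MatchCollection when only the first link is wanted.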

Related

SSIS Script Task Special Characters

I am trying to get some data from an API using an SSIS Script Task with Visual Basic code, but I am having problems because some fields contain Portuguese special characters. I read that this could be solved by encoding the data, and I tried that without any success.
Can anyone help me?
Regards
Using c As WebClient = New WebClient()
c.BaseAddress = URL
c.Headers(HttpRequestHeader.Authorization) = "Basic " + Convert.ToBase64String(System.Text.ASCIIEncoding.ASCII.GetBytes(key & ":" & pass))
c.Credentials = New NetworkCredential(key, pass)
Dim xml = c.DownloadString(URL & "CRUser?start=0&count=999&language=English")
'Parse into objects
Dim doc = XDocument.Parse(xml)
Dim reader = doc.CreateReader()
Dim xRoot As XmlRootAttribute = New XmlRootAttribute()
xRoot.ElementName = "array"
xRoot.IsNullable = True
Dim xmls As New XmlSerializer(GetType(List(Of CRUser)), xRoot)
Dim _data As List(Of CRUser) = xmls.Deserialize(reader)
For Each item As CRUser In _data
'Console.WriteLine(item.id)
'Console.WriteLine(item.firstName)
'..... remaining fields: inject the data into the SSIS pipeline
MyOutputBuffer.AddRow()
MyOutputBuffer.id = item.id
MyOutputBuffer.firstName = item.firstName
Next
End Using

Slicing a string to read a html document in VB

I was hoping someone could help me figure out why this script will not return the link names. I am trying to return a sub-string from 'http://textfiles.com/directory.html' which just writes the link names to the console, but I am struggling. The main problem - as far as I can see - is in the 'do until' loop. The working code outputs the html text to the console more for my sake than anything else (it does this successfully), but this feature may also help you guys understand the total picture I am facing. Maybe after seeing the code/ understanding my goal you guys can see where I am going wrong AND/OR suggest a better method for achieving my goal. Thanks a ton!
Imports System.IO
Imports System.Text
Module Module1
Sub Main()
Dim line As String = ""
Dim lowBound As String = "<a href="""
Dim highBound As String = """>"
Console.WriteLine("Grab link names from textfiles.com")
Console.WriteLine("")
Dim siteName As String = "http://textfiles.com/directory.html"
Dim tmpString As StringBuilder = New StringBuilder
My.Computer.Network.DownloadFile(siteName, "C:\~\VisualStudio\BeginnerPractice\TextFileDotCom_GrabLinkNames\TextFileDotCom_GrabLinkNames\bin\debug\directory.html", False, 500)
Dim myReader As StreamReader = New StreamReader("C:\~\VisualStudio\BeginnerPractice\TextFileDotCom_GrabLinkNames\TextFileDotCom_GrabLinkNames\bin\debug\directory.html")
While Not IsNothing(line)
line = myReader.ReadLine()
If Not IsNothing(line) Then
tmpString.Append(line)
End If
End While
Dim pageText As String = tmpString.ToString
Console.WriteLine(pageText)
Dim intCounter As Integer = 1
Do Until intCounter >= Len(pageText)
Dim checkSub As String = Mid(pageText, intCounter + 1, (Len(pageText) - intCounter))
Dim positLow As Integer = InStr(checkSub, lowBound)
Dim positHigh As Integer = InStr(checkSub, highBound)
If (positLow > 0 And positHigh > 0) And positLow < positHigh Then
Dim indexLow As Integer = checkSub.IndexOf(lowBound)
Dim indexHigh As Integer = checkSub.IndexOf(highBound)
Dim foundLink As String = checkSub.Substring(indexLow + Len(lowBound), indexHigh - Len(highBound))
Console.WriteLine(foundLink)
intCounter = intCounter + (Len(lowBound) + Len(highBound) + Len(foundLink) - 1)
Else
intCounter = Len(pageText)
End If
Loop
Console.ReadLine()
myReader.Close()
My.Computer.FileSystem.DeleteFile("C:\~\VisualStudio\BeginnerPractice\TextFileDotCom_GrabLinkNames\TextFileDotCom_GrabLinkNames\bin\debug\directory.html")
End Sub
End Module

Regex matching first occurrence only?

This is the problem:
Code:
Dim findtext As String = "(?<=<hello>)(.*?)(?=</hello>)"
Dim myregex As String = TextBox1.Text
Dim doregex As MatchCollection = Regex.Matches(myregex, findtext)
MsgBox(doregex(0).ToString)
TextBox1:
<hello>1</hello>
<hello>2</hello>
<hello>3</hello>
So, when I run the code, it shows a MsgBox with 1. Why only 1? Why not 2 and 3?
I added ? to .*, but it's still the same.
The MatchCollection contains multiple items but you are only retrieving the first one with doregex(0). Use a loop to get to the others:
Dim doregex As MatchCollection = Regex.Matches(myregex, findtext)
For Each match As Match In doregex
MsgBox(match.ToString)
Next
EDIT:
To combine the values, append them to a String within the loop before you use it:
Dim doregex As MatchCollection = Regex.Matches(myregex, findtext)
Dim matches As String = "" ' consider StringBuilder if there are many matches
For Each match As Match In doregex
matches = matches + match.ToString + " "
Next
MsgBox(matches)
You are only showing the first item in the MatchCollection. Use a For Each loop to show all items, like this:
For Each item In doregex
MsgBox(item.ToString)
Next
You can combine the items in many ways; below is one of them:
Dim result As String = String.Empty
For Each item In doregex
result = String.Format("{0} {1}", result, item)
Next
MsgBox(result)
Use LINQ:
Dim text_box_text = "<hello>1</hello>" & vbLf & "<hello>2</hello>" & vbLf & "<hello>3</hello>"
Dim findtext As String = "(?<=<hello>)(.*?)(?=</hello>)"
Dim my_matches_1 As List(Of String) = System.Text.RegularExpressions.Regex.Matches(text_box_text, findtext) _
.Cast(Of Match)() _
.Select(Function(m) m.Value) _
.ToList()
MsgBox(String.Join(vbLf, my_matches_1))
Also, with this code, you do not need to use the resource-consuming lookarounds. Change the regex to
Dim findtext As String = "<hello>(.*?)</hello>"
and use .Select(Function(m) m.Groups(1).Value) instead of .Select(Function(m) m.Value).
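Put together, the capture-group variant might look like the sketch below. It is a console program rather than a form, so the sample text stands in for TextBox1.Text; the module and variable names are only illustrative.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module CaptureGroupDemo
    Sub Main()
        ' Sample input standing in for TextBox1.Text.
        Dim text_box_text As String = "<hello>1</hello>" & vbLf & "<hello>2</hello>" & vbLf & "<hello>3</hello>"

        ' No lookarounds: the tags are matched, and group 1 holds the inner text.
        Dim findtext As String = "<hello>(.*?)</hello>"

        Dim values As List(Of String) = Regex.Matches(text_box_text, findtext) _
            .Cast(Of Match)() _
            .Select(Function(m) m.Groups(1).Value) _
            .ToList()

        Console.WriteLine(String.Join(", ", values))   ' prints 1, 2, 3
    End Sub
End Module
```

m.Value would return the whole match including the tags, which is why Groups(1) is needed once the lookarounds are removed.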

vb.net from string to listbox line by line

I made a web request to get the HTML code of a website, and then I extract the wanted links with HtmlAgilityPack, like this:
'webrequest'
Dim rt As String = TextBox1.Text
Dim wRequest As WebRequest
Dim WResponse As WebResponse
Dim SR As StreamReader
wRequest = WebRequest.Create(rt)
WResponse = wRequest.GetResponse
SR = New StreamReader(WResponse.GetResponseStream)
rt = SR.ReadToEnd
TextBox2.Text = rt
'htmlagility to extract the links'
Dim htmlDoc1 As New HtmlDocument()
htmlDoc1.LoadHtml(rt)
Dim links = htmlDoc1.DocumentNode.SelectNodes("//*[@id='catlist-listview']/ul/li/a")
Dim hrefs = links.Cast(Of HtmlNode).Select(Function(x) x.GetAttributeValue("href", ""))
'join the `hrefs`, separated by newline, into one string'
textbox3.text = String.Join(Environment.NewLine, hrefs)
the links are like this :
http://wantedlink1
http://wantedlink2
http://wantedlink3
http://wantedlink4
http://wantedlink5
http://wantedlink6
http://wantedlink7
Now I want to add every line in the string to a ListBox instead of a TextBox, one item for each line.
There are about 400 of these http://wantedlink entries.
hrefs in your case is already an IEnumerable(Of String). Joining the values into one string and then splitting it again just to make this work is a roundabout route. Since String.Split() returns an array, you probably only need to convert hrefs to an array for .AddRange() to work:
ListBox1.Items.AddRange(hrefs.ToArray())
Use the AddRange method of the listbox's items collection and pass it the lines array of the textbox.
AddRange
Lines
Hint: It's one line of code.
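A minimal sketch of that one-liner follows, assuming a WinForms project. The controls are created in code here only so the example is self-contained; in the question they already exist on the form, and the three links are placeholder data.

```vbnet
Imports System
Imports System.Windows.Forms

Module AddLinesDemo
    <STAThread>
    Sub Main()
        ' Hypothetical controls standing in for the ones on the form.
        Dim TextBox3 As New TextBox() With {.Multiline = True}
        Dim ListBox1 As New ListBox()

        ' Placeholder links, joined with newlines as in the question.
        Dim links As String() = {"http://wantedlink1", "http://wantedlink2", "http://wantedlink3"}
        TextBox3.Text = String.Join(Environment.NewLine, links)

        ' Lines exposes the text box content as a String array, one element per line.
        ListBox1.Items.AddRange(TextBox3.Lines)

        Console.WriteLine(ListBox1.Items.Count)
    End Sub
End Module
```

This route only makes sense if the text box is still needed; otherwise passing the array built from hrefs directly skips the join and split entirely.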
It's OK, I found the answer:
Dim linklist = String.Join(Environment.NewLine, hrefs)
Dim parts As String() = linklist.Split(New String() {Environment.NewLine},
StringSplitOptions.None)
ListBox1.Items.AddRange(parts)
This adds all 400 links to the ListBox.

Trying to use a HTML file as my email body using iMsg in VB.NET

I've written a script that creates an HTML file based on a SQL query, and it has become necessary to email that HTML. Most of our execs use BlackBerrys, so I want to send the HTML file as the body. I have found a roundabout way to get this done: adding a WebBrowser control, having it load the file, and then using the code below to send. The problem I'm facing is that if I automate the code fully, it only emails part of the HTML document, whereas if I add a button and make it run the email function, it sends correctly. I have added a wait function in several different locations, thinking the HTML might not be fully created before emailing. I have to get this 100% automated. Is there a way I can make .HTMLBody use the actual HTML file stored on C: (the actual path is C:\Turnover.html)? Thanks all for any help.
Public Sub Email()
Dim strdate
Dim iCfg As Object
Dim iMsg As Object
strdate = Date.Today.TimeOfDay
iCfg = CreateObject("CDO.Configuration")
iMsg = CreateObject("CDO.Message")
With iCfg.Fields
.Item("http://schemas.microsoft.com/cdo/configuration/sendusing") = 1
.Item("http://schemas.microsoft.com/cdo/configuration/smtpserverport") = 25
.Item("http://schemas.microsoft.com/cdo/configuration/smtpserver") = "xxxxx.com"
.Item("http://schemas.microsoft.com/cdo/configuration/smtpauthenticate") = 1
.Item("http://schemas.microsoft.com/cdo/configuration/sendemailaddress") = """Turnover Report"" <TurnoverReports@xxxxx.com>"
.Update()
End With
With iMsg
.Configuration = iCfg
.Subject = "Turnover Report"
.To = "xxxxx@xxxxx.com"
'.Cc = ""
.HTMLBody = WebBrowserReportView.DocumentText
.Send()
End With
iMsg = Nothing
iCfg = Nothing
End Sub
I used the function below to read in a local HTML file, then set
TextBox2.Text = getHTML("C:\Turnover2.html")
and also
.HTMLBody = TextBox2.Text
Private Function getHTML(ByVal address As String) As String
Dim rt As String = ""
Dim wRequest As WebRequest
Dim wResponse As WebResponse
Dim SR As StreamReader
wRequest = WebRequest.Create(address)
wResponse = wRequest.GetResponse
SR = New StreamReader(wResponse.GetResponseStream)
rt = SR.ReadToEnd
SR.Close()
Return rt
End Function
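Since the file is local, a shorter alternative to the WebRequest approach is System.IO.File.ReadAllText. The sketch below is not the poster's method; the GetHtmlFromFile name and the temp file are hypothetical, the latter standing in for C:\Turnover2.html so the example can run anywhere.

```vbnet
Imports System
Imports System.IO

Module ReadHtmlDemo
    ' Hypothetical helper: reads a local file into a string.
    ' ReadAllText opens, reads, and closes the file in one call.
    Function GetHtmlFromFile(ByVal filePath As String) As String
        Return File.ReadAllText(filePath)
    End Function

    Sub Main()
        ' Temp file standing in for the generated report.
        Dim tempPath As String = Path.Combine(Path.GetTempPath(), "turnover_demo.html")
        File.WriteAllText(tempPath, "<html><body><h1>Turnover Report</h1></body></html>")

        ' This string is what would be assigned to .HTMLBody.
        Dim body As String = GetHtmlFromFile(tempPath)
        Console.WriteLine(body)

        File.Delete(tempPath)
    End Sub
End Module
```

Reading the file directly also removes the dependency on the TextBox and the WebBrowser control, so nothing has to finish rendering before the email is sent.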