I want to scrape the top 200 result links from a Google search for a keyword.
I am using HttpWebRequest.
Is there a simpler way to do it?
So far I have the first code block below.
The second block is what I tried in order to get more pages, but it freezes the program and does not add more results.
How can I fix that?
Code:
Dim request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://www.google.com/search?num=100&q=" & TextBox1.Text)
Dim response As System.Net.HttpWebResponse = request.GetResponse
Dim stream As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
Dim page As String = stream.ReadToEnd
Dim regexobj As Regex = New Regex("http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\\^\\*\(\)_\-\=\+\\\/\?\.\:\;\,]*)?")
Dim matches As MatchCollection = regexobj.Matches(page)
For Each item As Match In matches
    If Not item.Value.Contains("google") And Not item.Value.Contains("wj") Then
        ListBox1.Items.Add(item.Value)
    End If
Next
Code:
Dim url As Integer = 1
Do Until url = 10
    For Each item As Match In matches
        If Not item.Value.Contains("google") And Not item.Value.Contains("wj") Then
            ListBox1.Items.Add(item.Value & url)
        End If
    Next
    url = url - 1
Loop
Any help would be appreciated.
Thanks.