Channel: VBForums - Visual Basic .NET

VS 2010 Scraping a Google search page for the top 200 search links for a keyword

I want to scrape the top 200 search links from Google's results page for a keyword.

I am using HttpWebRequest.
Is there a simpler way to do it?

So far I have this:

Code:

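' Requires: Imports System.Text.RegularExpressions
' Request the first 100 results; the keyword is URL-encoded so spaces and symbols survive.
Dim request As System.Net.HttpWebRequest = CType(System.Net.WebRequest.Create("http://www.google.com/search?num=100&q=" & Uri.EscapeDataString(TextBox1.Text)), System.Net.HttpWebRequest)
Dim page As String
' Using blocks close the response and the reader even if reading fails.
Using response As System.Net.HttpWebResponse = CType(request.GetResponse(), System.Net.HttpWebResponse)
    Using reader As New System.IO.StreamReader(response.GetResponseStream())
        page = reader.ReadToEnd()
    End Using
End Using
' Match anything in the page that looks like an http:// URL.
Dim regexobj As Regex = New Regex("http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\\^\\*\(\)_\-\=\+\\\/\?\.\:\;\,]*)?")
Dim matches As MatchCollection = regexobj.Matches(page)
For Each item As Match In matches
    ' Skip Google's own links.
    If Not item.Value.Contains("google") AndAlso Not item.Value.Contains("wj") Then
        ListBox1.Items.Add(item.Value)
    End If
Next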
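
On the question of a simpler way: one option that may be shorter than HttpWebRequest is System.Net.WebClient, whose DownloadString method returns the whole page as a string in one call. The following is only a minimal sketch. The module name and the helper names BuildSearchUrl and DownloadPage are hypothetical, and it assumes Google still answers plain HTTP requests to this URL, which is not guaranteed for automated clients.

```vbnet
Imports System
Imports System.Net

Module SearchDownload
    ' Hypothetical helper: builds the search URL with the keyword URL-encoded.
    Function BuildSearchUrl(keyword As String) As String
        Return "http://www.google.com/search?num=100&q=" & Uri.EscapeDataString(keyword)
    End Function

    ' Hypothetical helper: downloads one page and returns its HTML as a string.
    Function DownloadPage(url As String) As String
        Using client As New WebClient()
            Return client.DownloadString(url)
        End Using
    End Function
End Module
```

With these in place, the request, response and StreamReader lines could be reduced to something like Dim page As String = DownloadPage(BuildSearchUrl(TextBox1.Text)).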

This is what I have tried, but it freezes the program and does not add more than 200 pages.
Code:

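Dim url As Integer = 1
Do Until url = 10
    ' Note: this re-reads the same matches each time; no new page is requested here.
    For Each item As Match In matches
        If Not item.Value.Contains("google") And Not item.Value.Contains("wj") Then
            ListBox1.Items.Add(item.Value & url)
        End If
    Next
    ' url starts at 1 and is decremented, so it never equals 10 and the loop never ends.
    url = url - 1
Loop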

How can I fix that?
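
The freeze most likely comes from the loop counter: url starts at 1 and is decremented, so it never reaches 10, and the loop body never requests a new page anyway. A possible fix is sketched below. It assumes Google still honors the num and start query parameters for paging, which may not hold. CollectLinks, the module name and the simplified link pattern are hypothetical names and choices made for this sketch, not part of the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports System.Text.RegularExpressions

Module SearchScraper
    ' Hypothetical helper: collects up to maxLinks links from successive result pages.
    Function CollectLinks(keyword As String, maxLinks As Integer) As List(Of String)
        Dim links As New List(Of String)
        ' Simplified pattern (an assumption): an http:// URL up to the next quote, bracket or space.
        Dim linkPattern As New Regex("http://[\w\-\.]+[^\s""'<>]*")
        Const pageSize As Integer = 100

        Using client As New WebClient()
            ' start advances by pageSize on every pass, so the loop always terminates.
            For start As Integer = 0 To maxLinks - 1 Step pageSize
                Dim url As String = "http://www.google.com/search?num=" & pageSize.ToString() &
                                    "&start=" & start.ToString() &
                                    "&q=" & Uri.EscapeDataString(keyword)
                Dim page As String = client.DownloadString(url)

                For Each item As Match In linkPattern.Matches(page)
                    If links.Count >= maxLinks Then Return links
                    ' Same filters as in the post above, plus a duplicate check.
                    If Not item.Value.Contains("google") AndAlso
                       Not item.Value.Contains("wj") AndAlso
                       Not links.Contains(item.Value) Then
                        links.Add(item.Value)
                    End If
                Next
            Next
        End Using
        Return links
    End Function
End Module
```

It could then be called as ListBox1.Items.AddRange(CollectLinks(TextBox1.Text, 200).ToArray()). The downloads still block the calling thread, so if this runs in a button handler the form will be unresponsive until both pages arrive. Moving the call into a BackgroundWorker is one way around that.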

Any help would be appreciated.

Thanks
