Channel: VBForums - Visual Basic .NET

VS 2012 Looping with WebBrowser and DocumentCompleted

As part of a migration project I created a WinForms app to scrape data from several areas of a client’s website. One part is complete, but the next requires looping through a series of pages, and I’m trying to figure out the best structure for the code to facilitate this.

The form has a visible WebBrowser, an operation ComboBox, and an execute button. My concept was that when I select an operation and click the button, it performs the various routines, accumulating data in a DataTable which I then save to an XML file when complete. I have created one operation, with most of the code in a class that takes the WebBrowser ByRef and loops through an HTML table of data. The next operation has no similar report page, so I need to loop through integer-sequenced URLs, e.g. http://company.com/cars.cfm?carID=1, http://company.com/cars.cfm?carID=2, extracting table data from each page until no more exist. The operation after that will be trucks, and so on.

Normally my mind wants to write a “For i = 1 To 1000” loop where i is used for the ID in the URL, but of course after a Navigate I need to wait for the page to load before I extract the contents. From what I read, handling the DocumentCompleted event is the best practice, but an event handler can’t be kept inside the loop. What’s more, I’m having trouble visualizing how I would put that in a class and have a separate class for each operation. It seems the event handler wants to be in the form, but the WebBrowser in that form will be used for different operations. I could put a Select Case in the handler, but that is starting to sound inelegant. I also saw a post by Jim about making a quasi-loop out of 2 routines (one navigates, the other handles the completed page and kicks off the next navigation), but again this feels icky. I could just give up and create a separate form for each operation, but that seems dumb.
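
To make the quasi-loop idea concrete, here is a rough, untested sketch of how I picture it living in a class instead of the form. The class name SequentialPageScraper, the Finished event, the extract callback, and the maxId cap are all names I made up for illustration; only WebBrowser, Navigate, DocumentCompleted, and HtmlDocument come from the framework. The class attaches its own DocumentCompleted handler with AddHandler and removes it when done, so the form would not need a Select Case.

```vb
Imports System
Imports System.Windows.Forms

' Hypothetical helper: visits ...?carID=1, 2, 3, ... one page per DocumentCompleted.
Public Class SequentialPageScraper
    Private ReadOnly _browser As WebBrowser
    Private ReadOnly _urlFormat As String   ' e.g. "http://company.com/cars.cfm?carID={0}"
    Private ReadOnly _extract As Func(Of HtmlDocument, Boolean) ' returns False when a page has no data
    Private ReadOnly _maxId As Integer      ' safety cap so a bad page cannot loop forever
    Private _currentId As Integer

    ' Raised once, after the last page has been processed.
    Public Event Finished As EventHandler

    Public Sub New(browser As WebBrowser, urlFormat As String,
                   extract As Func(Of HtmlDocument, Boolean),
                   Optional maxId As Integer = 1000)
        _browser = browser
        _urlFormat = urlFormat
        _extract = extract
        _maxId = maxId
    End Sub

    ' Replaces the start of the "For i = 1 To 1000" loop.
    Public Sub Start()
        _currentId = 1
        AddHandler _browser.DocumentCompleted, AddressOf OnDocumentCompleted
        NavigateToCurrent()
    End Sub

    Private Sub NavigateToCurrent()
        _browser.Navigate(String.Format(_urlFormat, _currentId))
    End Sub

    ' Plays the role of the loop body plus "Next".
    Private Sub OnDocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        ' Frames raise this event too; assumption: comparing URLs is enough to skip them.
        If Not e.Url.Equals(_browser.Url) Then Return

        Dim hasData As Boolean = _extract.Invoke(_browser.Document)

        If hasData AndAlso _currentId < _maxId Then
            _currentId += 1
            NavigateToCurrent()
        Else
            ' Detach so the next operation can attach its own handler to the same browser.
            RemoveHandler _browser.DocumentCompleted, AddressOf OnDocumentCompleted
            RaiseEvent Finished(Me, EventArgs.Empty)
        End If
    End Sub
End Class
```

The button click would then create one of these per operation, passing a different URL format and extract routine for cars, trucks, and so on, and save the DataTable in a Finished handler. I have not checked how this behaves with redirects or error pages.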

Does anyone have some advice on how best to structure this?

As an aside, I normally do this with HTTPRequests and RegEx, but this time I wanted to try the HTMLDocument DOM to extract the text.
