Channel: VBForums - Visual Basic .NET

VS 2012 How to do scraping off an existing web app?

Hi!

I have been tasked with building a PCL (portable class library) that will fetch data from an existing web page (built ages ago in Java, I think). It is an HTTPS web page and I have to log in to it. They want me to write a Model (a bunch of classes based on MVVM Light) for all the things that are presented on this web page. There are no other APIs, such as JSON, for this; there is only this web application that I have to scrape data from.

What is the best way to do this? There are 4 different web pages I need to fetch data from, and as far as I can tell the data is mostly presented in HTML tables.

My questions are:

1) How do I pass along credentials (i.e. log in to the app), and does HTTPS pose a problem? They have set up a dev account I can use to log in.

2) How do I parse the HTML response I get when I request a page? I need to extract data from comboboxes and forms/tables, and I also have to extract pictures as binary image objects, because the model will not use URLs for images but the actual object as a .NET image bitmap. What pattern/method is the best?
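To make question 1 (and the picture part of question 2) concrete, here is a minimal sketch of the kind of code involved. It assumes the site uses an ordinary HTML form login that sets a session cookie, which has not been confirmed for this app. The class name `ScraperSession`, the relative path `login`, and the field names `username` and `password` are made up for illustration; the real ones would have to be read from the login page's form. HTTPS itself normally needs no extra code with `HttpClient`, as long as the server certificate is trusted by the device.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports System.Net.Http
Imports System.Threading.Tasks

' Hypothetical helper: keeps one HttpClient whose cookie container
' carries the login session across all later requests.
Public Class ScraperSession
    Private ReadOnly _client As HttpClient

    Public Sub New(baseUrl As String)
        Dim handler As New HttpClientHandler() With {
            .CookieContainer = New CookieContainer(),
            .UseCookies = True
        }
        _client = New HttpClient(handler) With {.BaseAddress = New Uri(baseUrl)}
    End Sub

    ' Posts the login form. "login", "username" and "password" are
    ' assumptions; use the action and input names from the real form.
    Public Async Function LoginAsync(userName As String, password As String) As Task(Of Boolean)
        Dim fields As New Dictionary(Of String, String) From {
            {"username", userName},
            {"password", password}
        }
        Using content As New FormUrlEncodedContent(fields)
            Dim response = Await _client.PostAsync("login", content)
            Return response.IsSuccessStatusCode
        End Using
    End Function

    ' Fetches a page as one HTML string (the session cookie is sent automatically).
    Public Function GetPageAsync(relativePath As String) As Task(Of String)
        Return _client.GetStringAsync(relativePath)
    End Function

    ' Fetches a picture as raw bytes; turning the bytes into a bitmap
    ' is platform-specific and left to the app project.
    Public Function GetImageBytesAsync(relativePath As String) As Task(Of Byte())
        Return _client.GetByteArrayAsync(relativePath)
    End Function
End Class
```

A success status code does not prove that the login worked, since many sites answer a failed login with a normal page. Checking the returned HTML for something that only appears when logged in would be more reliable.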

I guess I have to use the web client from NuGet. But it feels really complicated to "read" the huge HTML response string I get. Can I use some clever LINQ or a 3rd-party library to find the HTML tag I am looking for and read data from it? I will build up the model objects based on this text, and on this I am clueless. The response is basically just one big bunch of HTML; it's not nicely formatted like XML or JSON. How do I query/parse it to get exactly the info I need?
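As an illustration of the "LINQ or 3rd-party library" idea, here is a minimal sketch using the Html Agility Pack NuGet package, which turns the HTML string into a node tree that can be queried with LINQ. The module name `TableScraper` and the table id passed in are hypothetical. The sketch sticks to `Descendants` and `Elements` rather than XPath, on the assumption that XPath methods may not be available in a portable build of the library.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports HtmlAgilityPack

Public Module TableScraper
    ' Returns the data rows of the table with the given id.
    ' Each row is a list of cell texts; rows without <td> cells
    ' (for example header rows made of <th>) are skipped.
    Public Function ReadTableRows(html As String, tableId As String) As List(Of List(Of String))
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Find the first <table> whose id attribute matches.
        Dim table = doc.DocumentNode.Descendants("table").
            FirstOrDefault(Function(t) t.GetAttributeValue("id", "") = tableId)
        If table Is Nothing Then Return New List(Of List(Of String))()

        ' Decode entities such as &amp; and trim whitespace in every cell.
        Return table.Descendants("tr").
            Select(Function(row) row.Elements("td").
                Select(Function(cell) HtmlEntity.DeEntitize(cell.InnerText).Trim()).
                ToList()).
            Where(Function(cells) cells.Count > 0).
            ToList()
    End Function
End Module
```

Each returned row could then be mapped onto a model object such as a person or a task. Comboboxes could be read the same way by looking at `select` nodes and their `option` children.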

The model items are fairly simple, such as company, department, person, task and so on.

Apparently they have already done this once for an iPhone app, but they can't find the source code for it, and I have to get started: they need the PCL finished in about 2 weeks.

Please help me out here with some hints, tips and suggestions. Step 2 is to build a Windows Phone 8 app based on this PCL, and step 3 is a Windows 8 Store app...

Kind regards
Henrik - who knows a little bit about requests, responses, posts, gets, HTTP and HTML, but not nearly enough to interact with a web page just through VB.NET code :)
