Tuesday, December 15, 2009

Web Retriever: Part 1

Yesterday, after I finished a few finals, I decided to start working on a basic web scraping program. It annoys me that I spend a lot of my development time offline, yet so many good references live online (www.cplusplus.com, the Java APIs, the MSDN Library, etc.). So I started work on my WebRetriever program, which will crawl any web page I give it and retrieve all the content it can reach from that page by following hyperlinks. The first step was to create a URL class, which does the basic URL parsing and lets me view the individual pieces of a URL with ease.
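To make the idea concrete, here is a minimal sketch of what a URL class like this might look like. It is written in Python rather than C#, and it is not the code in URL.cs: the class name `Url`, its fields, and the `parse` and `resolve` methods are all assumptions made for illustration. It leans on the standard library's `urllib.parse` to do the splitting, then exposes each piece as a plain attribute.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin, urlsplit

# Well-known ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


@dataclass(frozen=True)
class Url:
    """Hypothetical URL wrapper: one attribute per piece of the URL."""
    scheme: str
    host: str
    port: Optional[int]
    path: str
    query: str
    fragment: str

    @classmethod
    def parse(cls, text: str) -> "Url":
        parts = urlsplit(text.strip())
        # A crawler needs absolute URLs; reject anything without scheme and host.
        if not parts.scheme or not parts.hostname:
            raise ValueError(f"not an absolute URL: {text!r}")
        port = parts.port or DEFAULT_PORTS.get(parts.scheme)
        return cls(parts.scheme, parts.hostname, port,
                   parts.path or "/", parts.query, parts.fragment)

    def resolve(self, href: str) -> "Url":
        """Turn a hyperlink found on this page into an absolute Url."""
        return Url.parse(urljoin(str(self), href))

    def __str__(self) -> str:
        text = f"{self.scheme}://{self.host}"
        # Only spell out the port when it is not the scheme's default.
        if self.port is not None and self.port != DEFAULT_PORTS.get(self.scheme):
            text += f":{self.port}"
        text += self.path
        if self.query:
            text += f"?{self.query}"
        if self.fragment:
            text += f"#{self.fragment}"
        return text


if __name__ == "__main__":
    page = Url.parse("http://www.cplusplus.com/reference/string/")
    print(page.host, page.path)
    print(page.resolve("../vector/"))
```

The `resolve` method is the part a crawler would use most, since the hyperlinks on a page are often relative and have to be combined with the page's own URL before they can be fetched.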

The C# code for the URL class can be found here: URL.cs
