If I had the time...

by Ethan Kershaw
October 11, 2011 12:22 AM

I’ve been an Android user for about two years now. Seeing Lance and Zachary hard at work on Android development recently has me really wanting to do some personal Android work. Over the past two years I’ve had a lot of ideas for things I would like to make for my phone, but have never really found the time to dive into learning Android.

Whenever I do get around to trying my hand with Android development, I will probably try to make some of the apps I’ve been thinking of since I started using my Droid. I’ve already started to learn iOS development thanks to one of my classes, so also knowing how to develop for Android would be a good tool to have.

Since Lance and Zachary are already building up solid Android experience, maybe I can get help from them when I get around to learning it. More realistically, though, this is probably going to have to wait until a break from school.

Crawling FHU.edu

by Michael Clark
October 4, 2011 8:07 PM

A broken-link and spell-check application that we can use to verify the content on the FHU website has been on my project list for a while now. As it currently stands, the application back-end is probably 75% complete.

The broken-link portion successfully crawls the website (any website, for that matter) and records the status code for each page and for every link on that page. This has enabled us to find instances where a typo was made when editing CMS pages, or where a link points to a page that no longer exists. The broken-link portion currently only records status codes of 200 (OK), 301 (moved permanently), 404 (not found), or 500 (internal server error). If a page returns anything but those, the application records a status code of 99, noting that further investigation is needed.
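To make the status-code rule concrete, here is a rough Python sketch (the real application is .NET, so this is only an illustration). The names `recorded_status` and `extract_links` are hypothetical, and the sketch leaves out the actual HTTP requests; it only shows how links might be pulled from a page and how a response code could be mapped to the value that gets stored.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Status codes that are stored as-is, per the description above.
KNOWN_STATUSES = {200, 301, 404, 500}
# Placeholder code meaning "needs further investigation".
NEEDS_REVIEW = 99


def recorded_status(http_status):
    """Map an HTTP status to the value the crawler would store."""
    return http_status if http_status in KNOWN_STATUSES else NEEDS_REVIEW


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link found in the given HTML, as absolute URLs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

A crawler built on this would request each extracted link, pass the response code through `recorded_status`, and store the result alongside the URL.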

The spell-checking portion is what I'm currently working on, and it has been a headache, to say the least. I'm using the Hunspell spell checker through the NHunspell .NET wrapper. Hunspell is the same spell checker used by LibreOffice, Mozilla, Eclipse, Google Chrome and Mac OS X Snow Leopard. The biggest hurdle at the moment is parsing page content so that things like HTML tags are not checked. This has been problematic simply because of the sheer number of cases to find and fix.
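The tag-stripping problem can be sketched in a few lines of Python. This is not the NHunspell code; the `dictionary` argument is just a plain set of lowercase words standing in for a real Hunspell lookup, and `visible_words` and `misspelled` are hypothetical names. The idea is to collect only the text between tags, skip the contents of `script` and `style` elements, and check what is left.

```python
import re
from html.parser import HTMLParser

# Elements whose contents should never reach the spell checker.
SKIPPED_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring tags and script/style contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_words(html):
    """Return the words a reader would see, with markup removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return re.findall(r"[A-Za-z']+", " ".join(parser.chunks))


def misspelled(html, dictionary):
    """Return words not found in the (assumed lowercase) dictionary set."""
    return [w for w in visible_words(html) if w.lower() not in dictionary]
```

Even a sketch like this hints at why the number of cases grows quickly: email addresses, URLs in text, abbreviations and proper nouns would all need their own handling before the results are useful.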

To parse HTML content, I've come across a free .NET library called HTML Agility Pack. The library uses XPath syntax and lets me easily select different sections of the markup (nodes). From there I can remove the nodes, split them, or add to them.
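For readers who have not used XPath-style selection, here is a small Python illustration of the select-then-remove idea. It uses the standard library's `xml.etree.ElementTree` rather than HTML Agility Pack, it only supports a limited XPath subset, and it assumes well-formed markup, which real pages often are not. The helper name `remove_nodes` is hypothetical.

```python
import xml.etree.ElementTree as ET


def remove_nodes(root, tag):
    """Remove every descendant element named `tag` from the tree.

    The path ".//tag/.." selects the parents of matching elements,
    because ElementTree can only remove a node through its parent.
    Note: any text that directly follows a removed element (its
    "tail") is dropped along with it.
    """
    for parent in root.findall(f".//{tag}/.."):
        for child in list(parent.findall(tag)):
            parent.remove(child)
    return root
```

For example, parsing `<div><p>Hello</p><script>var x;</script></div>` with `ET.fromstring` and calling `remove_nodes(root, "script")` leaves only the paragraph in the tree.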

Once I finish the back-end portion, I'll begin working on a front-end that will grab data from the database and generate reports detailing the broken links and misspelled words on the website. The back-end will be a task application that runs periodically.