I recently discovered the 50apps challenge, a year-long challenge that publishes weekly programming exercises. I hope to participate in as many of the exercises as my schedule permits. The first three weeks of the challenge focused on exploring the Python programming language. These are my notes.
I wrote a web crawler in Python. It works by getting the contents of a web page, finding all the links and following them to a specified depth, while scanning for some search text.
I found Python pretty straightforward to work with despite not having used or even read it before. I was surprised that I managed to complete the exercise in two hours. The documentation was good. Finding the regex method to use took the most time, and having to explicitly cast was annoying.
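To make the description above concrete, here is a minimal sketch of that kind of crawler, not the code I actually submitted. It is written for Python 3 using only the standard library; the names `LINK_RE`, `fetch` and `crawl` are made up for illustration, and the regex is a deliberately naive way to find links (a proper HTML parser would be more robust).

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Naive pattern for href values; assumed good enough for a sketch.
LINK_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def fetch(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(url, search_text, depth, fetch_page=fetch, seen=None):
    """Return the URLs, at most `depth` links away, whose text contains `search_text`."""
    seen = set() if seen is None else seen
    if depth < 0 or url in seen:
        return set()
    seen.add(url)
    try:
        html = fetch_page(url)
    except (OSError, ValueError):
        return set()  # unreachable or malformed URL: skip it
    matches = {url} if search_text in html else set()
    # Follow every link one level deeper, resolving relative hrefs.
    for href in LINK_RE.findall(html):
        matches |= crawl(urljoin(url, href), search_text, depth - 1, fetch_page, seen)
    return matches
```

The explicit casting shows up as soon as this is driven from the command line: arguments arrive as strings, so the depth has to be converted with something like `int(sys.argv[3])` before it can be compared or decremented.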
I built a Django website for the web crawler I wrote the week before.
I used Django, which is a Python web framework. It was a larger framework than I was expecting, coming in at over four megabytes, but it did seem to have a lot of features. It took a little longer than I wanted to get something going: I spent two hours getting a basic form working and another two hours adding some more advanced features. I personally prefer Sinatra-style frameworks like Express as they seem to make more sense to me.
I made my application reuse the code I wrote the week before. I am wary that it could create a security hole, depending on how Django cleans form input data. The code to reuse the previous functionality was a little more complicated than it should have been because of the way I wrote the code from the week before, but I wanted to see if I could reuse it without changing it.
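One way to be less dependent on how the framework cleans form data is to validate the values again just before they reach the crawler. The following is a framework-independent sketch of that idea, not what my application does; `clean_crawl_input` and `MAX_DEPTH` are hypothetical names, and the depth cap of 3 is an arbitrary assumption.

```python
from urllib.parse import urlparse

MAX_DEPTH = 3  # hypothetical cap so a form submission cannot trigger a huge crawl


def clean_crawl_input(raw_url, raw_depth, raw_text):
    """Validate raw form strings; raise ValueError on anything suspicious."""
    parts = urlparse(raw_url.strip())
    # Reject file://, ftp:// and relative URLs outright.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("only absolute http(s) URLs are accepted")
    depth = int(raw_depth)  # raises ValueError for non-numeric input
    if not 0 <= depth <= MAX_DEPTH:
        raise ValueError("depth out of range")
    text = raw_text.strip()
    if not text:
        raise ValueError("search text is required")
    return parts.geturl(), depth, text
```

The view would call this first and only hand the returned values to the crawler, showing the error message on the form if a `ValueError` is raised.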
I spent over an hour trying to get Django working with Google App Engine. I quit while I was ahead because the proposed solutions I found looked hacky or required installing another package. It seemed pretty painful compared to other hosting sites like no.de or Heroku. To Google App Engine’s credit, the logging was good.
This week we explored Python’s functional programming side. I wrote a website that graphs word statistics for a given page. The logic for gathering the statistics was to be written without using looping constructs. I used a primitive form of TDD, just the command line and Python’s assert statement. Later I integrated the statistics logic into the web interface. This time I avoided trying to reuse previous code, as it was rather different and I was running out of time.
It took about an hour to get a filtered list of words with a count, and another hour to get rid of duplicates, limit the results to only ten words and find the shortest and longest words. Hooking it all up to the web interface took another hour. A lot of the time was spent re-reading the documentation and head scratching.
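The steps above (filter the words, count them, drop duplicates, keep the top ten, find the shortest and longest) can be sketched without any explicit `for` or `while` loops. This is an illustration rather than my submitted code: `word_stats` and the `STOP_WORDS` list are made up, and it leans on `collections.Counter` for the counting, which may or may not be within the spirit of the exercise.

```python
import re
from collections import Counter

# Hypothetical list of words to filter out before counting.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "it"})


def word_stats(text, limit=10):
    """Return the most common words plus the shortest and longest word."""
    all_words = re.findall(r"[a-z]+", text.lower())
    kept = list(filter(lambda w: w not in STOP_WORDS, all_words))
    unique = sorted(set(kept))  # duplicates removed, order made deterministic
    return {
        "top": Counter(kept).most_common(limit),
        "shortest": min(unique, key=len) if unique else None,
        "longest": max(unique, key=len) if unique else None,
    }
```

The primitive TDD then amounts to lines such as `assert word_stats("")["top"] == []` typed at the interpreter prompt, adding a new assertion before each new piece of logic.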
By the way, I ran the tool over my blog, and it turns out I use the word ‘I’ a lot.