This isn't a hardware project; in fact, it could be done on any internet-connected computer capable of running Python. What the Pi brings me, though, is the ability to leave a computer on 24/7 processing feeds without using significant amounts of power.
It uses the Natural Language Toolkit (http://nltk.org/), a very powerful set of computational linguistics libraries and a fascinating set of toys in their own right. I'm writing about it here because I'm sure other people will want to use the NLTK on the Pi, and though the installation process isn't arduous, I couldn't find any FAQs or HOWTOs about it written with respect to the Pi.
So as a first post, here's how to install NLTK on your Pi. I'm using the currently downloadable Debian image; the process is unlikely to differ much on other distributions, apart from substituting your own package manager for apt.
First port of call: http://nltk.org/install.html. The basic instructions are below, with my Pi-specific observations.
- Open a prompt and type python -V to find out what version of Python is installed. On my Pi, the version is Python 2.6.
- Install Setuptools: Point your browser (in my case Midori) at http://pypi.python.org/pypi/setuptools and download the Setuptools file that matches your Python version (scroll to the bottom, and pick the filename that contains the right version number and has the extension .egg).
In my case the file required was http://pypi.python.org/packages/2.6/s/setuptools/setuptools-0.6c11-py2.6.egg. I downloaded it and saved it in my home directory (/home/pi).
Install it by typing sudo sh Downloads/setuptools-...egg, giving the location of the downloaded file.
- Install Pip: run sudo easy_install pip
- Install Numpy: The NLTK page suggests running sudo pip install numpy --upgrade, but sadly that failed on my distribution, citing missing libraries. Fortunately Numpy is available ready-compiled via apt, so I ran sudo apt-get install python-numpy instead.
- Install NLTK: run sudo pip install nltk --upgrade
- Test installation: run python, then type import nltk at the >>> prompt. If it returns without an error, the installation worked.
After following the steps above I had a fully functional NLTK. (https://twitter.com/#!/thekeywordgeek/status/202876787301158912)
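If you'd rather check everything in one go than import each package by hand, something like the following rough sketch should do it. It isn't part of the NLTK instructions; the function names module_version and report are my own invention, and I'm assuming each package can be imported under the name shown (setuptools, pip, numpy, nltk). It is written to avoid anything newer than Python 2.6.

```python
import sys


def module_version(name):
    """Return the module's __version__ string, "unknown" if it has none,
    or None if the module cannot be imported at all."""
    try:
        module = __import__(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report(names):
    """Build a list of status lines: the Python version first,
    then one line per module name."""
    lines = ["Python %d.%d" % (sys.version_info[0], sys.version_info[1])]
    for name in names:
        version = module_version(name)
        if version is None:
            lines.append("%s: NOT installed" % name)
        else:
            lines.append("%s: installed (version %s)" % (name, version))
    return lines


if __name__ == "__main__":
    # Import names assumed for the packages installed in the steps above.
    for line in report(["setuptools", "pip", "numpy", "nltk"]):
        print(line)
```

Save it under any filename you like and run it with python. A "NOT installed" line tells you which step to revisit.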
If you've followed these instructions and are wondering what you can do with NLTK, I suggest looking at http://nltk.googlecode.com/svn/trunk/doc/book/ch01.html and skipping to "Getting started with NLTK".