https://github.com/ultrasaurus/ml_class

code for the ml class

Machine Learning
================

class taught by [Hilary Mason](http://www.hilarymason.com/)

Install
--------

* JSONView Chrome extension
* Python (2.5, 2.6 or 2.7). Note from scipy.org: the NumPy installer should be used with the Python from http://python.org, not with Apple's Python. The two are incompatible: for one, the python.org version is 32-bit while the Apple version is 64-bit. Apple is also well behind on security updates, so python.org is normally the way to go.
* [NLTK](http://www.nltk.org/download)
* [NumPy](http://sourceforge.net/projects/numpy)
* pycluster
* hcluster

Verify your install. In the examples below, type the commands that follow '$' at the command prompt; the rest is sample output.

    $ python --version
    Python 2.6.1

Any version of 2.5, 2.6 or 2.7 is fine. Python 3.0 will not work.

    $ python
    Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import nltk
    >>>

If the import returns without an error, the install worked. Type exit() to leave the interactive Python console.
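
The same check can be scripted. The sketch below is not part of the class code: `check_modules` is a hypothetical helper, and the names passed to it (`nltk`, `numpy`) are assumed to be the import names of the packages listed under Install.

```python
import sys


def check_modules(names):
    """Try to import each module and return {name: True/False}."""
    results = {}
    for name in names:
        try:
            __import__(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Report the interpreter version, then each module's import status.
    print("Python %d.%d" % sys.version_info[:2])
    for name, ok in sorted(check_modules(["nltk", "numpy"]).items()):
        print("%-8s %s" % (name, "ok" if ok else "MISSING"))
```

Any module reported as MISSING needs to be installed for the interpreter that ran the script, which may differ from the one your installer targeted.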

References
----------

* [Peter Skomoroch's dataset Bookmarks](http://www.delicious.com/pskomoroch/dataset)
* [Data Source Handbook by Pete Warden](http://oreilly.com/catalog/0636920018254)

Classifying Web Documents
-------------------------

Register for an API key at http://developer.nytimes.com/apps/register and select "Article Search API".

![Article Search API](https://img.skitch.com/20110731-b79yqahpf58d43ss2grqi1pi8.png)

Example command line (append your API key after api-key=):

    curl "http://api.nytimes.com/svc/search/v1/article?query=jazz&api-key="
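
The same request can be built from Python. This is a minimal sketch, not the contents of nytimes_pull.py: `build_search_url` and `fetch` are hypothetical helpers, the endpoint and parameter names are copied from the curl example above, and the v1 endpoint may no longer be served.

```python
try:
    from urllib.parse import urlencode  # Python 3
    from urllib.request import urlopen
except ImportError:
    from urllib import urlencode, urlopen  # Python 2

# Endpoint taken from the curl example; assumed to still accept these parameters.
BASE_URL = "http://api.nytimes.com/svc/search/v1/article"


def build_search_url(query, api_key):
    """Return the search URL with the query and key URL-encoded."""
    params = [("query", query), ("api-key", api_key)]
    return BASE_URL + "?" + urlencode(params)


def fetch(query, api_key):
    """Perform the request and return the raw response body (network call)."""
    response = urlopen(build_search_url(query, api_key))
    try:
        return response.read()
    finally:
        response.close()


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder; substitute the key you registered for.
    print(build_search_url("jazz", "YOUR_KEY"))
```

Building the query string with `urlencode` rather than string concatenation keeps multi-word queries such as "new york" correctly escaped.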

Then run:

    python nytimes_pull.py

This creates two files, "arts" and "sports".

Naive Bayes Classifier

python