Winter 2015
02/09/2015
9:00 am: Arrivals
9:15 am: Error Analysis and Tree/Forest Challenges
10:15 am: SVMs
11:00 am: Challenges + Work on McNulty
12:00 pm: Lunch
1:30 pm: Work on McNulty
5:00 pm: Departures
w5d1_SVMs.pdf (2.5 MB)
SVM math
A tutorial on SVMs
Another tutorial on SVMs
An Idiot's Guide to SVMs
SVM lecture
How to tune SVM Parameters
Preprocessing data in sklearn
SVMs in sklearn
RBF Kernel
We will go back to the original Supervised Learning Challenges.
For the house representatives data set, calculate the accuracy, precision, recall and f1 scores of each classifier you built (on the test set).
For each, draw the ROC curve and calculate the AUC.
Calculate the same metrics you did in challenge 1, but this time in a cross-validation scheme with the cross_val_score function (as in Challenge 9).
For your movie classifiers, calculate the precision and recall for each class.
Draw the ROC curve (and calculate the AUC) for the logistic regression classifier from Challenge 12.
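The challenges above can be sketched roughly as follows. This is a minimal sketch, not a reference solution: it assumes a recent scikit-learn (where cross_val_score lives in sklearn.model_selection), uses a synthetic make_classification data set as a stand-in for the house representatives and movie data, and uses logistic regression as the example classifier. Names such as test_scores and cv_scores are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, auc, f1_score, precision_score,
                             recall_score, roc_curve)
from sklearn.model_selection import cross_val_score, train_test_split

# Assumption: synthetic binary data standing in for the real challenge data.
X, y = make_classification(n_samples=400, n_features=16, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Accuracy, precision, recall and f1 on the held-out test set.
test_scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}

# ROC curve and AUC, computed from the predicted probability of class 1.
y_score = clf.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)

# The same four metrics, averaged over 5-fold cross-validation.
cv_scores = {
    name: cross_val_score(LogisticRegression(max_iter=1000), X, y,
                          cv=5, scoring=name).mean()
    for name in ("accuracy", "precision", "recall", "f1")
}

# Precision and recall for each class separately (one entry per class).
precision_per_class = precision_score(y_test, y_pred, average=None)
recall_per_class = recall_score(y_test, y_pred, average=None)

print(test_scores)
print("AUC:", roc_auc)
print(cv_scores)
print(precision_per_class, recall_per_class)
```

To draw the curve, plotting fpr against tpr (for example with matplotlib's plt.plot(fpr, tpr)) gives the ROC plot. For classifiers without predict_proba, decision_function scores can be passed to roc_curve instead.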
Note: If you already installed pydot and it is not working, uninstall it first:
pip uninstall pydot
Otherwise, you can start here:
pip uninstall pyparsing
pip install -Iv https://pypi.python.org/packages/source/p/pyparsing/pyparsing-1.5.7.tar.gz#md5=9be0fcdcc595199c646ab317c1d9a709
pip install pydot
brew install graphviz
Note: If you're trying to draw a tree and you get an error about not finding dot_parser, try the following and it should be fixed:
pip install pyparsing==1.5.7
Tree / Forest Challenges
For the house representatives data set, fit and plot a decision tree classifier.
Fit and draw a decision tree classifier for your movie dataset.
Tackle the Titanic Survivors Kaggle competition with decision trees. Look at your splits: how does your tree decide?
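A possible starting point for fitting a tree and inspecting its splits is sketched below. It assumes a recent scikit-learn (export_text requires version 0.21 or later) and uses synthetic data with made-up feature names in place of the house representatives, movie, or Titanic data. It avoids pydot by asking export_graphviz for the DOT source directly.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_graphviz, export_text

# Assumption: synthetic data and hypothetical feature names as placeholders.
X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
feature_names = ["feature_%d" % i for i in range(X.shape[1])]

# A shallow tree keeps the splits readable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Plain-text view of the splits: which feature and threshold at each node.
rules = export_text(tree, feature_names=feature_names)
print(rules)

# DOT source for Graphviz; out_file=None returns it as a string.
dot_source = export_graphviz(tree, out_file=None,
                             feature_names=feature_names,
                             class_names=["class_0", "class_1"],
                             filled=True)
```

Writing dot_source to a file such as tree.dot and running dot -Tpng tree.dot -o tree.png (from the graphviz install above) renders the drawing. The text rules are often enough to see how the tree decides.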