evaluating a learning algorithm
deciding what to try next
debugging a learning algorithm
Suppose the learned hypothesis makes unacceptably large errors in its predictions. Things to try:
- get more training examples - fix high variance
- try smaller sets of features - fix high variance
- try getting additional features - fix high bias
- try adding polynomial features - fix high bias
- try decreasing λ - fix high bias
- try increasing λ - fix high variance
diagnostic
A test that you can run to gain insight into what is or isn't working with a learning algorithm, and to get guidance on how best to improve its performance.
Diagnostics can take time to implement, but doing so can be a very good use of time.
evaluating a hypothesis
training set: 70% (randomly shuffle the data before splitting)
test set: 30%
training/testing procedure for linear regression
- learn the parameters θ from the training data (by minimizing the training error J(θ))
- compute the test set error (the cost function evaluated on the test set)
- for classification problems, compute the misclassification error (the percentage of wrong predictions); see the sketch below
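A minimal sketch of this procedure in plain NumPy, assuming a regularized normal-equation fit on synthetic data (all names and numbers here are illustrative, not from the course):

```python
import numpy as np

def fit_linear_regression(X, y, lam=0.0):
    """Learn theta by minimizing the regularized training error J(theta) (normal equation)."""
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])   # add intercept column
    L = np.eye(n + 1)
    L[0, 0] = 0.0                          # do not regularize the intercept
    return np.linalg.solve(Xb.T @ Xb + lam * L, Xb.T @ y)

def cost(X, y, theta):
    """Unregularized squared-error cost, used for the test set error."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    e = Xb @ theta - y
    return (e @ e) / (2 * len(y))

# Synthetic data; randomly shuffle, then split 70% train / 30% test.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)
idx = rng.permutation(len(y))
split = int(0.7 * len(y))
train, test = idx[:split], idx[split:]

theta = fit_linear_regression(X[train], y[train], lam=1.0)
print("test set error:", cost(X[test], y[test], theta))

# For a classifier, the misclassification error would simply be, e.g.:
#   np.mean(predictions != labels)
```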
model selection and training/validation/test sets
model selection
d: what degree of polynomial to choose for hypothesis
calculate the test set error for different degrees of polynomial and choose the one with the minimum error
evaluating hypothesis
- training set: 60%
- cross validation set: 20%
- test set: 20%
Fit θ for each polynomial degree on the training set, pick the degree with the lowest error on the cross-validation set, and estimate the generalization error on the test set (see the sketch below).
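A sketch of this selection loop using np.polyfit on synthetic one-dimensional data (the data, split sizes, and candidate degrees are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)

# 60% train / 20% cross-validation / 20% test split
idx = rng.permutation(len(y))
n_tr, n_cv = int(0.6 * len(y)), int(0.2 * len(y))
tr, cv, te = idx[:n_tr], idx[n_tr:n_tr + n_cv], idx[n_tr + n_cv:]

def sq_err(coeffs, x, y):
    """Average squared error of a polynomial model."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2) / 2

# Fit each candidate degree d on the training set, pick d on the CV set.
cv_err = {}
for d in range(1, 11):
    coeffs = np.polyfit(x[tr], y[tr], d)
    cv_err[d] = sq_err(coeffs, x[cv], y[cv])
best_d = min(cv_err, key=cv_err.get)

# Estimate the generalization error on the held-out test set.
coeffs = np.polyfit(x[tr], y[tr], best_d)
print("chosen degree:", best_d, "test error:", sq_err(coeffs, x[te], y[te]))
```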
bias vs. variance
diagnosing bias vs. variance
High bias: underfit. high training error and high validation error
High variance: overfit. low training error and much higher validation error
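A tiny sketch of this diagnostic as code; the threshold and the factor of two are illustrative assumptions, not fixed rules:

```python
def diagnose(j_train, j_cv, acceptable_error=0.1):
    """Rough bias/variance diagnosis from training and cross-validation errors."""
    if j_train > acceptable_error and j_cv > acceptable_error:
        return "high bias (underfit)"
    if j_train <= acceptable_error and j_cv > 2 * j_train:
        return "high variance (overfit)"
    return "looks reasonable"

print(diagnose(0.50, 0.55))   # both errors high -> high bias
print(diagnose(0.02, 0.40))   # large train/CV gap -> high variance
```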
regularization and bias/variance
regularization parameter λ
- large λ: high bias (underfit)
- small λ: high variance (overfit)
Define the cost functions for the training, validation, and test sets without the regularization term.
- Create a list of λ values (e.g. λ ∈ {0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24}).
- Create a set of models with different degrees or any other variants.
- Iterate through the λs and for each λ go through all the models to learn some θ.
- Learn the parameters θ for each model, minimizing Jtrain(θ) with the selected λ.
- Compute the training error using the learned θ (fit with that λ) on Jtrain(θ) with λ = 0, i.e. without regularization.
- Compute the cross-validation error using the learned θ (fit with that λ) on JCV(θ) with λ = 0, i.e. without regularization.
- Select the combination of model and λ that produces the lowest error on the cross-validation set.
- Using the best θ and λ, evaluate Jtest(θ) to see whether the model generalizes well (see the sketch below).
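A sketch of this λ-selection loop in plain NumPy (normal-equation fit; the data, polynomial degree, and λ grid are illustrative, and for brevity only one model is swept over λ rather than a full model × λ grid):

```python
import numpy as np

def poly(x, d):
    """Map 1-D input to polynomial features [x, x^2, ..., x^d]."""
    return np.column_stack([x ** p for p in range(1, d + 1)])

def fit(X, y, lam):
    """Minimize the regularized Jtrain(theta); the intercept is not regularized."""
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])
    L = np.eye(n + 1)
    L[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + lam * L, Xb.T @ y)

def j(X, y, theta):
    """Unregularized cost, used for Jtrain/JCV/Jtest evaluation."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    e = Xb @ theta - y
    return (e @ e) / (2 * len(y))

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=150)
y = x ** 2 + rng.normal(scale=0.5, size=150)
idx = rng.permutation(len(y))
tr, cv, te = idx[:90], idx[90:120], idx[120:]
d = 8   # a deliberately flexible model, so regularization matters

lams = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
results = []
for lam in lams:
    theta = fit(poly(x[tr], d), y[tr], lam)                 # learn theta with this lambda
    results.append((j(poly(x[cv], d), y[cv], theta), lam, theta))

best_cv, best_lam, best_theta = min(results, key=lambda t: t[0])
print("best lambda:", best_lam, "CV error:", best_cv)
print("test error:", j(poly(x[te], d), y[te], best_theta))
```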
learning curve
Plot Jtrain(θ) or JCV(θ) vs training set size m.
As m increases:
- Jtrain(θ) increases
- JCV(θ) decreases
When bias is high:
- The final errors for both training and validation will be high and similar
- Getting more training data will not help much
When variance is high:
- There is a large gap between the final training and validation errors, but they approach each other as m increases
- Getting more training data is likely to help (see the sketch below)
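A minimal sketch of a learning curve in plain NumPy: fit on growing training subsets and record Jtrain and JCV (the data and the degree are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, size=300)
y = np.sin(x) + rng.normal(scale=0.2, size=300)
idx = rng.permutation(len(y))
tr, cv = idx[:200], idx[200:]

def sq_err(coeffs, x, y):
    return np.mean((np.polyval(coeffs, x) - y) ** 2) / 2

d = 3
for m in range(10, len(tr) + 1, 20):
    coeffs = np.polyfit(x[tr[:m]], y[tr[:m]], d)   # train on the first m examples only
    j_train = sq_err(coeffs, x[tr[:m]], y[tr[:m]]) # tends to rise as m grows
    j_cv = sq_err(coeffs, x[cv], y[cv])            # tends to fall as m grows
    print(f"m={m:3d}  Jtrain={j_train:.3f}  JCV={j_cv:.3f}")
```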
neural network
- small neural network: fewer parameters; more prone to underfitting; computationally cheaper
- large neural network: more parameters; more prone to overfitting; computationally more expensive; use regularization (λ) to address overfitting
precision = (true positives)/(no. of predicted positives)
recall = (true positives)/(no. of actual positives)
use the F1 score to evaluate algorithms: F1 = 2PR/(P + R)
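A minimal sketch of these three metrics on made-up labels (1 = positive class; the arrays are illustrative):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # actual labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])   # predicted labels

tp = np.sum((y_pred == 1) & (y_true == 1))     # true positives
precision = tp / np.sum(y_pred == 1)           # TP / no. of predicted positives
recall = tp / np.sum(y_true == 1)              # TP / no. of actual positives
f1 = 2 * precision * recall / (precision + recall)
print(f"P={precision:.2f}  R={recall:.2f}  F1={f1:.2f}")
```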