# evaluating a learning algorithm

## deciding what to try next

### debugging a learning algorithm

Unacceptably large errors in its predictions.

- get more training examples - fix high variance
- try smaller sets of features - fix high variance
- try getting addtional features - fix high bias
- try adding polynomial features - fix high bias
- try decreasing - fix high bias
- try increasing - fix high variance

### diagnostic

A test that you can run to gain insight what is/isn’t working with a learning algorithm, and gain guidance as to how best to improve its performance.

Diagnostics can take time to implement, but doing so can be a very good use of time.

## evaluating a hypothesis

training set: 70% (better randomly shuffled)

test set: 30%

### training/testing procedure for linear regression

- learning parameter from training data (minimizing training error )
- compute test set error (cost function)
- get missclassification error (percentage of wrong predictions if classification problem)

## model selection and training/validation/test sets

### model selection

d: what degree of polynomial to choose for hypothesis

calculate the test set error for different degrees of polynomial cnd choose the one with minimum error

### evaluating hypothesis

- training set: 60%
- cross validation set: 20%
- test set: 20%

Test parameters with different degree of polynomial on cross validation set. Estimate generalization error on the test set

# bias vs. variance

## diagnosing bias vs. variance

High bias: underfit. high training error and high validation error

High variance: overfit. low training error and much high validation error

## regularization and bias/variance

regularization parameter

- large : high bias (underfit)
- small : high variance (overfit)

Define the cost function of training, validation and test sets without regularization terms.

- Create a list of

(i.e. ); - Create a set of models with different degrees or any other variants.
- Iterate through the s and for each go through all the models to learn some .
- Learn the parameter for the model selected, using with the selected.
- Compute the train error using the learned (computed with ) on the without regularization or .
- Compute the cross validation error using the learned (computed with ) on the without regularization or .
- Select the best combo that produces the lowest error on the cross validation set.
- Using the best combo and , apply it on to see if it has a good generalization of the problem.

## learning curve

Plot or vs training set size m.

While m increases:

- increasing
- decreasing

While bias is high:

- The final errors for both training and validation will be high and similar
- Getting more training data will not help much

While variance is high:

- large gap between final errors for training and validation sets but approach each other while m increases
- getting more training data is likely to help

## neural network

- small neural network: fewer parameters; more prone to underfitting; computationally cheaper
- large neural network: use regularization to address overfitting

precision = (true positives)/(no. of predicted positive)

recall = (true positives)/(no. of actual positive)

use F1 score formula to evaluate algorithms: