A test has been applied to 1010 patients to identify whether a rare disease is present. The following confusion matrix gives the result of the test.

                      Predicted
                   Yes     No    Total
Actual   Yes         7      3       10
         No         73    927     1000
         Total      80    930     1010

(a) From the table above, identify True Positive (TP), False Negative (FN), False Positive (FP) and True Negative (TN) and then calculate the precision, recall, F₁ score and false positive rate.
(b) The total classification error is simply the total number of misclassified items.
(i) Write down the formula for the total misclassification error in terms of the entries in the confusion matrix (TP, FN, FP, TN).
(ii) Consider a naive classifier C₁ that classifies everything as positive. What will the confusion matrix look like when you use C₁ to make predictions on this dataset?
(iii) Similarly, consider another naive classifier C₂ that classifies everything as negative. Write down the corresponding confusion matrix for this classifier on the dataset.
(iv) Write down the total classification errors of the two naive classifiers.
(v) Is classification error by itself a good measure of performance in this case? Explain why or why not.
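
The quantities asked for in parts (a) and (b) can be checked numerically. What follows is a minimal Python sketch, assuming the standard definitions precision = TP/(TP+FP), recall = TP/(TP+FN), F₁ = 2TP/(2TP+FP+FN) and false positive rate = FP/(FP+TN). The class name ConfusionMatrix and the helpers all_positive and all_negative are hypothetical names introduced here for illustration only, and returning NaN for a ratio with a zero denominator is a convention chosen for this sketch, not part of the question.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


@dataclass(frozen=True)
class ConfusionMatrix:
    """Hypothetical container for the four cells of a binary confusion matrix."""
    tp: int  # actual Yes, predicted Yes
    fn: int  # actual Yes, predicted No
    fp: int  # actual No,  predicted Yes
    tn: int  # actual No,  predicted No

    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    def recall(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    def f1(self) -> float:
        # Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN).
        return _ratio(2 * self.tp, 2 * self.tp + self.fp + self.fn)

    def false_positive_rate(self) -> float:
        return _ratio(self.fp, self.fp + self.tn)

    def misclassified(self) -> int:
        # Total classification error: every item that lands off the diagonal.
        return self.fp + self.fn


def all_positive(cm: ConfusionMatrix) -> ConfusionMatrix:
    """Confusion matrix of a classifier that labels every item positive."""
    positives, negatives = cm.tp + cm.fn, cm.fp + cm.tn
    return ConfusionMatrix(tp=positives, fn=0, fp=negatives, tn=0)


def all_negative(cm: ConfusionMatrix) -> ConfusionMatrix:
    """Confusion matrix of a classifier that labels every item negative."""
    positives, negatives = cm.tp + cm.fn, cm.fp + cm.tn
    return ConfusionMatrix(tp=0, fn=positives, fp=0, tn=negatives)


# Cell values read off the table in the question.
observed = ConfusionMatrix(tp=7, fn=3, fp=73, tn=927)
c1 = all_positive(observed)   # naive classifier C1
c2 = all_negative(observed)   # naive classifier C2

for name, cm in [("test", observed), ("C1", c1), ("C2", c2)]:
    print(name, cm, "misclassified:", cm.misclassified())

print("precision:", observed.precision())
print("recall:", observed.recall())
print("F1:", observed.f1())
print("false positive rate:", observed.false_positive_rate())
```

Comparing the misclassified counts of the test, C₁ and C₂ side by side is one way to approach part (v), since the two naive classifiers differ only in which class they ignore.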