Precision, Recall, and AUC-ROC curve
What are precision and recall? What metrics are used to evaluate an NLP model?
Precision is the ratio of true positive instances to the total number of positively predicted instances.
Formula:
Precision = TP / (TP + FP)
          = True Positives / Total Predicted Positives
Recall (also called TPR or sensitivity) is the ratio of true positive instances to the total number of actual positive instances.
Formula:
Recall = TP / (TP + FN)
       = True Positives / Total Actual Positives
Specificity = TN / (TN + FP)
FPR = 1 - Specificity
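A minimal sketch of these formulas in Python, using hypothetical confusion-matrix counts (the TP, FP, TN, FN values below are made-up examples, not taken from any real model):

```python
# Hypothetical confusion-matrix counts (example values only).
TP, FP, TN, FN = 80, 20, 90, 10

precision   = TP / (TP + FP)    # 80 / 100 = 0.800
recall      = TP / (TP + FN)    # 80 / 90  ≈ 0.889 (TPR / sensitivity)
specificity = TN / (TN + FP)    # 90 / 110 ≈ 0.818
fpr         = 1 - specificity   # ≈ 0.182

print(f"Precision:   {precision:.3f}")
print(f"Recall:      {recall:.3f}")
print(f"Specificity: {specificity:.3f}")
print(f"FPR:         {fpr:.3f}")
```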
F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
- F1 score is the harmonic mean of precision and recall.
- F1 score accounts for both false negative and false positive instances when evaluating the model.
- F1 score is a more reliable metric than accuracy for an NLP model when the class distribution is uneven, as shown in the sketch below.
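A minimal sketch, assuming scikit-learn is available, of why F1 is more informative than accuracy on imbalanced data; the toy labels below are made up purely for illustration:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy imbalanced data: 9 negatives, 1 positive.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A degenerate classifier that always predicts the majority class.
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))                    # 0.9 -- looks good
print("Precision:", precision_score(y_true, y_pred, zero_division=0))  # 0.0
print("Recall   :", recall_score(y_true, y_pred))                      # 0.0
print("F1       :", f1_score(y_true, y_pred, zero_division=0))         # 0.0 -- exposes the failure
```

Accuracy rewards the model for ignoring the minority class entirely, while F1 drops to zero because no positive instance is ever recovered.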
Image source: http://corysimon.github.io/articles/classification-metrics/
What is the AUC-ROC curve?
The AUC-ROC curve is a performance measurement for classification problems at various threshold settings.
ROC is a probability curve, and AUC represents the degree or measure of separability.
It tells how well the model can distinguish between classes.
The higher the AUC, the better the model is at predicting class 0 as 0 and class 1 as 1.
Example: the higher the AUC, the better the model is at distinguishing between patients with the disease and those without it.
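A minimal sketch, assuming scikit-learn is available, that computes the ROC curve and AUC from predicted probabilities; y_true and y_scores are made-up example values, not from a real model:

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true   = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]                         # actual classes
y_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.6, 0.7, 0.8, 0.9]   # predicted P(class = 1)

fpr, tpr, thresholds = roc_curve(y_true, y_scores)   # FPR and TPR at each threshold
auc = roc_auc_score(y_true, y_scores)

print("AUC:", auc)   # closer to 1.0 means better separation between the two classes
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```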