Data Science Interview Preparation

Precision, Recall, and AUC-ROC curve

What are precision and recall? What metrics are used to evaluate an NLP model?

Precision is the ratio of true positive instances to the total number of positively predicted instances.

Formula:

Precision = TP / (TP + FP)

          = True Positive / Total Predicted Positive


Recall (also called TPR or sensitivity) is the ratio of true positive instances to the total number of actual positive instances.

Formula:

Recall = TP / (TP + FN)

       = True Positive / Total Actual Positive


Specificity = TN / (TN + FP)

FPR = 1 - Specificity
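
A minimal sketch of these formulas in Python using scikit-learn; the labels below are made-up values for illustration only:

from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical binary labels, for illustration only
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision   = tp / (tp + fp)   # True Positive / Total Predicted Positive
recall      = tp / (tp + fn)   # True Positive / Total Actual Positive (TPR, sensitivity)
specificity = tn / (tn + fp)
fpr         = 1 - specificity

# Cross-check against scikit-learn's built-in metrics
print(precision, recall, specificity, fpr)
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))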


F1 Score = 2 * (Precision * Recall) / ( Precision + Recall)

  • F1 score is the harmonic mean of precision and recall. 
  • F1 score takes both false negative and false positive instances into account while evaluating the model. 
  • F1 score is a more reliable metric than accuracy for an NLP model when there is an uneven class distribution.
Example:

[Confusion matrix illustration — image source: http://corysimon.github.io/articles/classification-metrics/]
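
A short sketch of the F1 calculation on a made-up, imbalanced set of labels (illustration only), showing that the formula above matches scikit-learn's f1_score:

from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical predictions on an imbalanced dataset (2 positives out of 10)
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]

p  = precision_score(y_true, y_pred)   # 1 TP / (1 TP + 1 FP) = 0.5
r  = recall_score(y_true, y_pred)      # 1 TP / (1 TP + 1 FN) = 0.5
f1 = 2 * (p * r) / (p + r)             # harmonic mean of precision and recall

print(f1, f1_score(y_true, y_pred))    # both print 0.5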


What is the AUC-ROC curve?

The AUC-ROC curve is a performance measurement for classification problems at various threshold settings.

ROC is a probability curve, and AUC represents the degree or measure of separability.

It tells how well the model can distinguish between classes.

The higher the AUC, the better the model is at predicting 0 classes as 0 and 1 classes as 1.

Example: the higher the AUC, the better the model is at distinguishing between patients with the disease and patients without it.
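
A minimal sketch with scikit-learn, using made-up predicted probabilities (illustration only): roc_curve sweeps over decision thresholds to produce the FPR/TPR points of the curve, and roc_auc_score summarises the curve as a single number.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical true labels and predicted probabilities of the positive class
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.55, 0.70])

# ROC curve: FPR and TPR at each decision threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# AUC: 1.0 means perfect separation of the classes, 0.5 is random guessing
print(roc_auc_score(y_true, y_score))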








