15.071 | Spring 2017 | Graduate

The Analytics Edge

3.2 Modeling the Expert: An Introduction to Logistic Regression

Quick Question

This question will ask about the following ROC curve:

Figure: receiver operator characteristic (ROC) curve, plotting the true positive rate against the false positive rate.

Given this ROC curve, which threshold would you pick if you wanted to correctly identify a small group of patients who are receiving the worst care with high confidence?


Explanation: A threshold of 0.7 is the best choice for identifying a small group of patients who are receiving the worst care with high confidence. At this threshold we make very few false positive mistakes while identifying about 35% of the true positives. A threshold of 0.8 is not a good choice: it makes about the same number of false positive mistakes but identifies only 10% of the true positives. Thresholds of 0.2 and 0.3 both identify more of the true positives, but they also make more false positive mistakes, so our confidence decreases.
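To make the trade-off concrete, here is a minimal Python sketch of how the true positive rate and false positive rate at a given threshold could be computed. The function name `rates_at_threshold` and the toy probabilities and labels are assumptions invented for illustration; they are not the data behind the ROC curve in this question.

```python
def rates_at_threshold(probs, labels, t):
    """Return (true positive rate, false positive rate) when every case
    with predicted probability >= t is flagged as poor care (label 1)."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives

# Hypothetical predicted probabilities and true labels (1 = poor care).
probs = [0.95, 0.85, 0.75, 0.60, 0.45, 0.35, 0.25, 0.15, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

# Each threshold corresponds to one point on the ROC curve.
for t in (0.2, 0.3, 0.7, 0.8):
    tpr, fpr = rates_at_threshold(probs, labels, t)
    print(f"t = {t}: TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

In this toy data, raising the threshold never increases the false positive rate, but it can also lower the true positive rate, which is the same trade-off the ROC curve displays.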

Which threshold would you pick if you wanted to correctly identify half of the patients receiving poor care, while making as few errors as possible?


Explanation: A threshold of 0.3 is the best choice in this scenario. A threshold of 0.2 also identifies over half of the patients receiving poor care, but it makes many more false positive mistakes. Thresholds of 0.7 and 0.8 identify fewer than half of the patients receiving poor care.
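The reasoning above amounts to a constrained choice: among the thresholds that reach the required true positive rate, take the one with the lowest false positive rate. A minimal Python sketch of that rule follows. The helper `pick_threshold` is hypothetical, and the operating points are illustrative: only the true positive rates for 0.7 and 0.8 follow the figures quoted in this question, while every other number is an assumption, not a value read from the curve.

```python
# Illustrative operating points: threshold -> (TPR, FPR).
# TPR for 0.7 and 0.8 follows the approximate figures quoted in the text;
# all other numbers are assumptions made up for this sketch.
points = {
    0.2: (0.70, 0.40),
    0.3: (0.55, 0.20),
    0.7: (0.35, 0.02),
    0.8: (0.10, 0.02),
}

def pick_threshold(points, min_tpr):
    """Among thresholds whose TPR is at least min_tpr, return the one with
    the lowest FPR, or None if no threshold reaches min_tpr."""
    eligible = {t: fpr for t, (tpr, fpr) in points.items() if tpr >= min_tpr}
    if not eligible:
        return None
    return min(eligible, key=eligible.get)

# Require that at least half of the poor-care patients are identified.
print(pick_threshold(points, min_tpr=0.5))
```

With these assumed values, both 0.2 and 0.3 satisfy the 50% requirement, and 0.3 is selected because it makes fewer false positive mistakes.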
