  • KNN post estimation: probabilities of classification

    Hi statalisters,

    I ran an out-of-sample KNN classification that predicts poverty status from a single variable, x (code pasted below).

    Code:
    * generate a simple dataset
      clear all
      set seed 2876
      set obs 20
      
      gen x = round(runiform(0,100),1)
      gen y = round(runiform(0,100),1)
      gen r = round(runiform(0,1),1)
    
    * create a binary categorical income var
      gen poor = 1 if y < 20
      replace poor = 0 if y >= 20
      label define poor_lab 0 "non-poor" 1 "poor"
      label values poor poor_lab
      
    * run knn
      discrim knn x if r == 0, group(poor) k(3)
      estat list if r == 1
    The postestimation command -estat list- gives the probability of classifying each observation as "poor" or "non-poor". I want to create a variable equal to these probabilities. That is, after running this code, I want a variable called prob_nonpoor equal to 1 in observation 2, 1 in observation 3, .20 in observation 4, and so on. Stata does not seem to store this information, though.

    I'd appreciate any help, thank you!
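
    Edit: one possible route, sketched here on the assumption that -predict- after -discrim knn- accepts the pr option with one new variable per group, listed in the order of the group values (so 0 "non-poor" first). The variable names prob_nonpoor and prob_poor are my own choice.

    Code:
    * assumption: pr stores the probabilities of group membership
      predict prob_nonpoor prob_poor if r == 1, pr
      list x poor prob_nonpoor prob_poor if r == 1

    If this works as assumed, prob_nonpoor should match the "non-poor" column shown by -estat list-.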