  • variable already defined error in the foreach loop

    Hello,
    I am trying to use the Python-Stata interface inside a Stata foreach loop. My loop is shown below:

    Code:
    foreach a1 of local alphaGrid {
    
    python: a = float(Macro.getLocal("a1"))
    
    // predict using the best value for alpha
    python: mnb = MultinomialNB(alpha = a, class_prior = None, fit_prior = True)
    
    // model Accuracy, how often is the MultinomialNB classifier correct?
    python: Y_mnb_pred = mnb.fit(X_train, np.ravel(Y_train)).predict(X.iloc[:, :-1])
    
    // right now Y_mnb_pred is a numpy array; so change it into a list
    python: Y_mnb_pred = Y_mnb_pred.tolist()
    
    // transfer the Python variable Y_mnb_pred to the Stata variable 'yPred'
    python: Data.setObsTotal(nobs)
    python: Data.addVarFloat('yPred')
    python: Data.store(var = 'yPred', obs = None, val = Y_mnb_pred)
    
    generate correct yPred == category
    tabulate correct
    summarize correct
    `accuracy' = r(mean)
    
    replace Accuracy = `accuracy' in `i'
    replace Alpha = `a1' in `i'
    
    // update the counter i
    local i = `++i' 
    drop yPred 
    
    }
    This generates the error message "variable yPred already defined".
    I thought this would be fine, since I run -drop yPred- at the end of the loop.
    How can I fix this error?

    Thank you,

  • #2
    Your code has an error that will cause it to fail the first time through the loop just after creating yPred, which then is not dropped because you don't get to the bottom of the loop. So when you rerun your code to try to fix the problem, the left-over yPred already exists.

    This corrects the problem I mentioned.
    Code:
    local accuracy = r(mean)
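
The mechanism described in this reply can be sketched with a toy model. This is only an illustration in plain Python, not Stata or the sfi API: `ToyDataset`, `add_var`, `drop_var`, and `run_loop_body` are hypothetical names, and the "dataset" is just a set of variable names.

```python
# Toy model (hypothetical, not the sfi API): a dataset is a set of variable names.
class ToyDataset:
    def __init__(self):
        self.variables = set()

    def add_var(self, name):
        # Like Stata, refuse to create a variable that already exists.
        if name in self.variables:
            raise ValueError(f"variable {name} already defined")
        self.variables.add(name)

    def drop_var(self, name):
        self.variables.discard(name)


def run_loop_body(data, fail_midway):
    """One pass of a loop body that creates yPred and drops it at the end."""
    data.add_var("yPred")
    if fail_midway:
        # Stands in for the syntax error that aborts the loop.
        raise RuntimeError("invalid syntax")
    data.drop_var("yPred")  # never reached when the body fails


data = ToyDataset()

# First run: the body fails after yPred was created, so the drop is skipped.
try:
    run_loop_body(data, fail_midway=True)
except RuntimeError:
    pass
leftover = "yPred" in data.variables

# Rerun with the bug fixed: the leftover yPred now triggers the error.
try:
    run_loop_body(data, fail_midway=False)
    rerun_error = None
except ValueError as err:
    rerun_error = str(err)

print(leftover, rerun_error)
```

In Stata itself, a -capture drop yPred- at the top of the loop body guards against a leftover variable from an aborted run.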



    • #3
      Hello,

      Thank you for your reply.

      When I execute -clear- before re-running the loop, and after changing -`accuracy' = r(mean)- to -local accuracy = r(mean)-, I still get the same error message, "variable yPred already defined".
      Is there something else that is wrong with my loop? Thank you,



      • #4
        Apparently the code hits the error the first time -generate correct yPred == category- is run. How should I fix this line? Thank you,



        • #5
          Code:
          generate correct yPred == category
          should probably be
          Code:
          generate correct = ( yPred == category )



          • #6
            Thank you!



            • #7
              I suggest moving the Python code to a separate .py file. This also makes it easier to spot Stata syntax errors:
              Code:
              local i = 0
              
              foreach a1 of local alphaGrid {
              
                  local ++i    
                  capt drop yPred
              
                  python script mnb_pred.py  /* creates Stata variable yPred */
                  
                  capt drop correct  /* left over from the previous pass */
                  generate correct = ( yPred == category )
              
                  tabulate correct
                  
                  summarize correct
                  local accuracy = r(mean)
              
                  replace Accuracy = `accuracy' in `i'
                  
                  replace Alpha = `a1' in `i'    
              }
              In addition, the variable "correct" could be given a more descriptive name such as "correct_prediction". The local macro accuracy also seems unnecessary:
              Code:
              su correct_prediction , meanonly
              replace Accuracy = r(mean) in `i'

