
  • Issues with drop within program

    Hello everyone,

    I am having trouble dropping variables within a program I wrote. The program runs a cluster analysis for different numbers of clusters. The prefix of the names of the generated grouping variables is given by the cluster_name() argument. If the grouping variables already exist, I would like to drop them. I thought this was a straightforward task, but for some reason it does not work. The code drops all the variables, but it then fails to perform the cluster analysis, telling me that the variable already exists. Why is this happening?

    Thank you very much,


    Code:
    capture program drop kmeans_eval
    program define kmeans_eval
        syntax varlist, kmax(integer) cluster_name(string) 
        local list2 `varlist' 
        
        forvalues k = 1(1)`kmax' {
            capture drop `cluster_name'`k'
        }
        
        forvalues k = 1(1)`kmax' {
            cluster kmeans `list2', k(`k') start(random(123)) name(`cluster_name'`k')
        }
    
    end

  • #2
    Please give some example data using the dataex command, and run the program in context such that it reproduces the error.

    In the meantime, as someone who regularly wrestles with ado code, I suggest running
    Code:
    set tr on
    to see where the issue is happening under the hood
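
    A minimal sketch of how that might look in context. The variable names x1 and x2, the prefix fred, and kmax(3) are placeholders for illustration only, not taken from the original post:
    Code:
    * show each line of kmeans_eval as it executes
    set trace on
    * limit the trace to one level so output from nested programs is not expanded
    set tracedepth 1
    kmeans_eval x1 x2, kmax(3) cluster_name(fred)
    set trace off
    The last command echoed before the error message is usually the one that failed.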



    • #3
      I am not a user of cluster analysis, so this isn't a full explanation. You'll need to read the documentation for cluster further to understand the implications of this solution. See especially the output of help cluster utility to get an idea of what cluster is doing.

      The problem is that if you have once run the cluster command with, say, cluster_name(fred) having created variables fred1, fred2, ... in your dataset, running
      Code:
          forvalues k = 1(1)`kmax' {
              capture drop `cluster_name'`k'
          }
      drops the fred1 fred2 ... variables, as you expect. But cluster actually seems to store some other "cluster objects" that are not deleted, and that is what leads to the error message. It's not complaining that the variables already exist, but rather that the "cluster objects" already exist.

      Changing that loop to
      Code:
          forvalues k = 1(1)`kmax' {
              capture cluster drop `cluster_name'`k'
          }
      resolves the problem, deleting the other results along with deleting the variables, as does the much simpler
      Code:
          cluster drop _all
      (no loop required) but of course it drops all cluster objects and variables, not just those with the same cluster_name as the current call to kmeans_eval provided.
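
      Putting this together, a sketch of how the revised program might look. It assumes the same syntax as the original program; the extra capture drop is a precaution in case a plain variable with that name exists without an associated cluster object, and may be unnecessary:
      Code:
          capture program drop kmeans_eval
          program define kmeans_eval
              syntax varlist, kmax(integer) cluster_name(string)

              forvalues k = 1(1)`kmax' {
                  * remove any earlier cluster analysis stored under this name
                  capture cluster drop `cluster_name'`k'
                  * precaution: also remove a leftover plain variable of that name
                  capture drop `cluster_name'`k'
                  cluster kmeans `varlist', k(`k') start(random(123)) name(`cluster_name'`k')
              }
          end
      Running cluster dir before and after a call lists the cluster analyses currently defined, which should show whether old ones are being removed as intended.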
      Last edited by William Lisowski; 16 Apr 2022, 12:52.



      • #4
        Thank you very much!
