  • Three-dimensional indexed variables: how to loop over just the middle index of the variable name

    Dear Colleagues,

    I have a data set whose variables are named according to three indices, "country_#_option_#_occurrence_#" (abbreviated to co_#_op_#_oc_# in the example below), plus other variables (id, country, country_id, date).

    I have prepared a short example as follows:

    Code:
    clear
     
    input byte(id country country_id) float date byte(co_1_op_1_oc_1 co_1_op_1_oc_2 co_1_op_2_oc_1 co_1_op_2_oc_2 co_1_op_1_oc_3 ///
        co_2_op_1_oc_1 co_2_op_1_oc_2 co_2_op_1_oc_3 co_2_op_2_oc_1 co_2_op_2_oc_2 co_2_op_2_oc_3)
    
     1 1 1 20093 1 0 1 1 0 0 0 0 0 0 1  
     2 1 2 20095 0 1 0 0 0 0 0 1 0 0 0
     3 1 3 20103 1 0 0 0 1 0 0 0 0 0 0  
     4 2 1 20124 1 1 1 0 0 1 0 0 1 0 0  
     5 2 2 20129 0 1 1 0 0 0 0 0 0 1 0  
     6 2 3 20131 0 0 1 0 0 0 0 0 0 0 0  
     7 2 4 20137 1 0 1 0 0 0 1 0 0 0 0  
    
    end
    format %td date
    I am interested in counting the number of variables for each option (i.e., total variables for option_1, total for option_2, etc.).
    I do not want to reshape the data set. My idea is to do something like the following code (where "??" stands for the part I do not know how to write):

    Code:
    local co_list co_1 co_2
    local op_list op_1 op_2
    local oc_list oc_1 oc_2 oc_3
    
      foreach i in `op_list' {
        ds ??_`i'_??
        local total_op_`i' = `:word count `r(varlist)''
         }
    Any idea on how to proceed?

    Thank you very much!

  • #2
    If the variable names follow exactly the patterns you show in the example, then this will do it:

    Code:
    clear
     
    input byte(id country country_id) float date byte(co_1_op_1_oc_1 co_1_op_1_oc_2 co_1_op_2_oc_1 ///
        co_1_op_2_oc_2 co_1_op_1_oc_3 co_2_op_1_oc_1 co_2_op_1_oc_2 co_2_op_1_oc_3 ///
        co_2_op_2_oc_1 co_2_op_2_oc_2 co_2_op_2_oc_3)
    
     1 1 1 20093 1 0 1 1 0 0 0 0 0 0 1  
     2 1 2 20095 0 1 0 0 0 0 0 1 0 0 0
     3 1 3 20103 1 0 0 0 1 0 0 0 0 0 0  
     4 2 1 20124 1 1 1 0 0 1 0 0 1 0 0  
     5 2 2 20129 0 1 1 0 0 0 0 0 0 1 0  
     6 2 3 20131 0 0 1 0 0 0 0 0 0 0 0  
     7 2 4 20137 1 0 1 0 0 0 1 0 0 0 0  
    
    end
    format %td date
    
    local co_list co_1 co_2
    local op_list op_1 op_2
    local oc_list oc_1 oc_2 oc_3
    
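    foreach i in `op_list' {
        // list every variable whose name contains this option tag between underscores
        ds *_`i'_*
        // count the names that ds returned in r(varlist)
        local total_op_`i' = `:word count `r(varlist)''
        display `total_op_`i''
    }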
    Now, if there are extraneous variables in your data set that contain _op_1_ or _op_2_ in their names, you would have to make the pattern in the -ds- line a bit more specific:

    Code:
    ds co_*_`i'_oc_*
    If even that more restricted use of wildcards brings in extraneous variables, then I think you are going to have to just rename some variables to get around that.
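
    The two wildcard patterns above can be compared side by side. What follows is only a minimal sketch under stated assumptions, not part of the original answer: it recreates just the variable names from the example (the values do not matter when counting names), adds a made-up variable note_op_1_x to stand in for an extraneous match, and loops over the option numbers instead of over op_list. With `i' holding op_1, the local in the loop above ends up named total_op_op_1; looping over the numbers gives total_op_1 and total_op_2. It is meant to be run from a do-file, since it uses /// line continuation.

    ```stata
    clear
    set obs 1

    * Variable names copied from the example; the values are irrelevant here.
    local vars co_1_op_1_oc_1 co_1_op_1_oc_2 co_1_op_2_oc_1 co_1_op_2_oc_2 ///
        co_1_op_1_oc_3 co_2_op_1_oc_1 co_2_op_1_oc_2 co_2_op_1_oc_3 ///
        co_2_op_2_oc_1 co_2_op_2_oc_2 co_2_op_2_oc_3
    foreach v of local vars {
        generate byte `v' = 0
    }

    * Hypothetical extraneous variable (an assumption, not in the original data).
    generate byte note_op_1_x = 0

    foreach k in 1 2 {
        * Loose pattern: any name containing the option tag between underscores.
        quietly ds *_op_`k'_*
        local loose_op_`k' : word count `r(varlist)'

        * Strict pattern: only names of the form co_#_op_#_oc_#.
        quietly ds co_*_op_`k'_oc_*
        local total_op_`k' : word count `r(varlist)'

        display "option `k': loose = `loose_op_`k''  strict = `total_op_`k''"
    }
    ```

    In this sketch the loose pattern picks up note_op_1_x for option 1, so it counts 7 names where the strict pattern counts 6; for option 2 both patterns count 5, because the example has no co_1_op_2_oc_3.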


    • #3
      Thank you Clyde, your responses are always very helpful!

      Have a nice day, you and all Statalist colleagues.
