  • Creating a categorical variable from multiple numeric variables

    I apologize if this is a duplicate inquiry. I am working with the 2017-2018 Area Health Resource File, which is county-level data. I reshaped the data set into long layout, and it has been set up for longitudinal (panel) data analysis. I have looked at several previous threads and have not been able to find one that answers my question.

    I have 3 separate education variables in my dataset, each giving the number of non-veterans in a county who had a high school education, some college, or college or more in 2012-2016. I am hoping to create a categorical variable called "educ" that would be a combination of these three variables.

    I tried the following code to generate the new variable, but I haven't had any luck:

    gen educ=1 if nonvetedu_hs>1
    replace educ=2 if nonvetedu_hsplus>1
    replace educ=3 if nonvetedu_college>1
    replace educ=4 if nonvetedu_hs<. & nonvetedu_hsplus<. & nonvetedu_college<.

    My goal is to create categorical variables for other, similarly formatted variables (race, gender, income, etc.) so that I can use them in a regression model. I have included the three variables of interest plus the year variable. I previously reshaped from wide to long, and the data for 2013-2016 are missing. I am not sure how to fix this (or whether I need to fix it) for the model, but I will follow up on that in a separate post.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(nonvetedu_college nonvetedu_hsplus nonvetedu_hs) byte year
        .      .     . 10
        .      .     . 11
     6783  26851  4274 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
    35408 107434 13014 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     2074  12031  4616 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     1667  11594  2881 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     4499  27648  7355 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
      672   4530  2397 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     1996  10117  2482 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
    11694  55134 13080 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     2574  17094  4404 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     2293  13272  3196 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
        .      .     . 11
     3855  21120  5630 12
        .      .     . 13
        .      .     . 14
        .      .     . 15
        .      .     . 16
        .      .     . 17
        .      .     . 18
        .      .     . 10
    end
    Thank you in advance

  • #2
    Well, you show some code that, I suppose, didn't give you what you want. Since that code would just produce educ = 4 in the rows with no missing values, and 3 in the rows that are just missing values, I'm not surprised you don't like it: it doesn't seem like a very useful variable.
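
    To see why the code behaves this way, it may help to recall that Stata treats numeric missing (.) as larger than any number, so a condition such as nonvetedu_hs>1 is also true when the value is missing. A minimal sketch, using one all-missing row and one complete row from the example data:

    ```stata
    clear
    input long(nonvetedu_college nonvetedu_hsplus nonvetedu_hs)
        .      .     .
     6783  26851  4274
    end

    * each ">1" test is true for missing values as well as for large counts
    gen educ = 1 if nonvetedu_hs > 1
    replace educ = 2 if nonvetedu_hsplus > 1
    replace educ = 3 if nonvetedu_college > 1

    * only rows with all three counts present are recoded to 4
    replace educ = 4 if nonvetedu_hs < . & nonvetedu_hsplus < . & nonvetedu_college < .

    list
    ```

    The all-missing row ends up with educ = 3 and the complete row with educ = 4, so the variable only separates complete rows from missing ones.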

    But what do you want this variable to look like? It's one thing to combine a bunch of dichotomous variables into a single categorical variable. But it isn't at all clear how you want to combine a bunch of count variables into a single categorical variable. So try explaining what you want the result to look like and mean, and also provide some hand-calculated results for some illustrative cases.

    • #3
      Clyde Schechter Thank you for your response! If you recall, in a previous thread (post #20) you provided guidance on creating a dichotomous variable identifying whether a county ever had a federally qualified health center (FQHC), and a variable "wanted" that is 1 for any county that started out with no FQHC but at some point did get one (or more), and 0 in the other circumstances, i.e. if the county either started out with at least 1 FQHC, or started out with none and never got one later.

      Now that I have created those variables, I want to look at descriptive statistics in counties where "wanted" is 1 and compare them to counties where "wanted" is 0. I envision a descriptive table of the counties stratified by wanted = 1 and wanted = 0.

      After providing the descriptive statistics, I want to determine the magnitude of change in ER visits, provider presence, and medical specialists present in counties when an FQHC appears, controlling for race, gender, income, insurance status, and age. I am still working through the outcome variables because I worry about confounding between the control variables and the outcome variables.

      I have read some of the other posts and gathered that I need to consider Poisson, or possibly logistic, regression for my models that include count data; however, I am not sure I am there yet. Or I may be completely off and confused. Here are the variables as a sample:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long(nonvetedu_hs nonvetedu_college nonvetedu_hsplus) float(wanted ever_had_fqhc n_county)
          .     .      . 0 1  1
          .     .      . 0 1  1
       4274  6783  26851 0 1  1
          .     .      . 0 1  1
          .     .      . 0 1  1
          .     .      . 0 1  1
          .     .      . 0 1  1
          .     .      . 0 1  1
          .     .      . 0 1  1
          .     .      . 0 1  2
          .     .      . 0 1  2
      13014 35408 107434 0 1  2
          .     .      . 0 1  2
          .     .      . 0 1  2
          .     .      . 0 1  2
          .     .      . 0 1  2
          .     .      . 0 1  2
          .     .      . 0 1  2
          .     .      . 0 1  3
          .     .      . 0 1  3
       4616  2074  12031 0 1  3
          .     .      . 0 1  3
          .     .      . 0 1  3
          .     .      . 0 1  3
          .     .      . 0 1  3
          .     .      . 0 1  3
          .     .      . 0 1  3
          .     .      . 0 1  4
          .     .      . 0 1  4
       2881  1667  11594 0 1  4
          .     .      . 0 1  4
          .     .      . 0 1  4
          .     .      . 0 1  4
          .     .      . 0 1  4
          .     .      . 0 1  4
          .     .      . 0 1  4
          .     .      . 0 1  5
          .     .      . 0 1  5
       7355  4499  27648 0 1  5
          .     .      . 0 1  5
          .     .      . 0 1  5
          .     .      . 0 1  5
          .     .      . 0 1  5
          .     .      . 0 1  5
          .     .      . 0 1  5
          .     .      . 0 1  6
          .     .      . 0 1  6
       2397   672   4530 0 1  6
          .     .      . 0 1  6
          .     .      . 0 1  6
          .     .      . 0 1  6
          .     .      . 0 1  6
          .     .      . 0 1  6
          .     .      . 0 1  6
          .     .      . 0 1  7
          .     .      . 0 1  7
       2482  1996  10117 0 1  7
          .     .      . 0 1  7
          .     .      . 0 1  7
          .     .      . 0 1  7
          .     .      . 0 1  7
          .     .      . 0 1  7
          .     .      . 0 1  7
          .     .      . 0 1  8
          .     .      . 0 1  8
      13080 11694  55134 0 1  8
          .     .      . 0 1  8
          .     .      . 0 1  8
          .     .      . 0 1  8
          .     .      . 0 1  8
          .     .      . 0 1  8
          .     .      . 0 1  8
          .     .      . 0 1  9
          .     .      . 0 1  9
       4404  2574  17094 0 1  9
          .     .      . 0 1  9
          .     .      . 0 1  9
          .     .      . 0 1  9
          .     .      . 0 1  9
          .     .      . 0 1  9
          .     .      . 0 1  9
          .     .      . 1 1 10
          .     .      . 1 1 10
       3196  2293  13272 1 1 10
          .     .      . 1 1 10
          .     .      . 1 1 10
          .     .      . 1 1 10
          .     .      . 1 1 10
          .     .      . 1 1 10
          .     .      . 1 1 10
          .     .      . 1 1 11
          .     .      . 1 1 11
       5630  3855  21120 1 1 11
          .     .      . 1 1 11
          .     .      . 1 1 11
          .     .      . 1 1 11
          .     .      . 1 1 11
          .     .      . 1 1 11
          .     .      . 1 1 11
          .     .      . 0 1 12
      end
      label values n_county n_county
      label def n_county 1 "Alabama Autauga", modify
      label def n_county 2 "Alabama Baldwin", modify
      label def n_county 3 "Alabama Barbour", modify
      label def n_county 4 "Alabama Bibb", modify
      label def n_county 5 "Alabama Blount", modify
      label def n_county 6 "Alabama Bullock", modify
      label def n_county 7 "Alabama Butler", modify
      label def n_county 8 "Alabama Calhoun", modify
      label def n_county 9 "Alabama Chambers", modify
      label def n_county 10 "Alabama Cherokee", modify
      label def n_county 11 "Alabama Chilton", modify
      label def n_county 12 "Alabama Choctaw", modify

      • #4
        Well, this is a little bit clearer, but still leaves a great deal to my imagination, and the latter continues to fail me.

        It seems that you have multiple observations per county, I guess corresponding to different years since the context of this study is a longitudinal one. But strangely, the education variables you show have non-missing values only in the third year. Maybe that's because they come from the census or something like that? Is it your intention that these values should also apply to the other years in your study?

        It also seems that the unit of analysis in your study is the county. Now, counties do not have attributes like sex, race, or education. So you cannot create dichotomous variables corresponding to discrete characteristics like that. What you can do is provide variables that describe the distributions of such variables in each county. So for example, you could create three variables that give the proportion of all county residents in each educational group:

        Code:
        egen educ_total = rowtotal(nonvetedu_*)
        foreach v of varlist nonvetedu_* {
            gen prop_`v' = `v'/educ_total
        }
        These variables are somewhat analogous to indicator variables for discrete categorizations of individuals. In fact, just like indicator variables for the levels of an individual characterization, they add up to 1, so when used in a regression model, they will be collinear with the constant term and one will be omitted. But you can, in most respects, use these newly created variables as if they were indicator variables. (But if you use factor variable notation, they get the c. prefix, not the i. prefix.)
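
        As a quick check on the "add up to 1" property, something like the following could be run after the code above. It is only a sketch: prop_sum is a hypothetical scratch variable, and the tolerance is an assumption chosen to allow for float rounding.

        ```stata
        * assumes educ_total and the prop_nonvetedu_* variables were created as above
        egen prop_sum = rowtotal(prop_nonvetedu_*)

        * rows where every count is missing have educ_total == 0 and are skipped
        assert abs(prop_sum - 1) < 1e-5 if educ_total > 0

        drop prop_sum
        ```

        This also shows what educ_total is for: it is the denominator that turns each count into a proportion, and it is not needed once the proportions exist.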

        There is, however, no way to combine these into a single multi-level categorical variable.

        Added: And you can contrast the distributions of these attributes between your two categories of counties easily enough:
        Code:
        tabstat prop_nonvetedu_*, by(wanted)
        Is this what you had in mind?
        Last edited by Clyde Schechter; 22 Jan 2020, 19:32.

        • #5
          Hi Clyde,

          Thank you for trying to make sense of my garble. Let me try to provide further explanations by answering your questions.

          It seems that you have multiple observations per county, I guess corresponding to different years since the context of this study is a longitudinal one. But strangely, the education variables you show have non-missing values only in the third year. Maybe that's because they come from the census or something like that? Is it your intention that these values should also apply to the other years in your study?
          So the original dataset was wide and I reshaped it to long by the year and county variables. I wanted to identify all of the variables by year. I think that is why the education variables have some years with missing data. The original variable in long form was the number of persons 25+ with hs/some college/college plus education, 2012-2016. Below I have pasted a sample of the data in the long form. I think that when I reshaped it, the variable was only put at 2012 instead of being repeated for all the years 2012-2016. You are correct that they come from the 2012-2016 American Community Survey (ACS) Summary File, U.S. Census Bureau, and it is my intention that they should apply to the other years.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input long(nonvetedu_hs12 nonvetedu_hsplus12 nonvetedu_college12)
            3443  12026   2093
            9375  27514   4083
            4227  16423   3790
           14106 237705  92872
            2830   8726   1946
            1238  10857   3789
             278   4359    786
            2769  10062   1681
            8001  54705  13367
            1803  12001   1797
            3291  37891   9656
            4054  16520   3743
            3869  13420   1829
             192   2304    427
           52632 224316  61594
              57   1461    446
            3518   6345   1294
            1689  16585   4215
             286   2141    571
            3089  16145   3089
            1705  21229   8295
            4716   7691   2270
            9567  17495   5300
           11646  25578   8828
            6266  11759   2637
            4314  12122   3366
           15898  84505  24741
             852   9349   1669
           10970 126988  57977
           15384  79731  21043
          138370 917949 468876
            1451   6815   2080
           13662 174314  75804
             766  17639   9021
            5651  57349  32524
             797   6186   1034
            4568  18730   3726
             390   2035    281
             513   3337    601
            4884  19078   3080
            1163   3126    405
            9683  93265  62869
             424   3287    789
             725   5600   1141
            1081   7684   1541
            7005  63183  15172
            3046  23450   5650
            4911  39873   8136
            1422   5780   1295
            1670   8854   1688
           48322 749787 322494
             739   6914   1390
            3121   9335   1671
            6921  55238  11086
            3462  12831   1953
           23713 190936  59971
            1683   4485    607
            1848  16738   3221
              54    621    218
            3005  21580   5177
            1549   6355   1137
            3431  16542   3893
            2071   6604   1072
            6238  12899   3118
           12056 147724  55047
            7891  29678   4177
            7200  40228  11271
             479   4292    894
           19742  98195  23237
            1606  11806   2558
             987   9664   2584
            2625   6819    889
            7075  58058  13571
           10862  40223   7916
           28552 298452 127286
           14015 195021  60963
            3420  12946   1556
             350   3659    749
            1441  13862   4089
            9203  31421   4300
             821   7174   1468
            2766   8394   1350
            1617   7838   1905
            2979  12580   2767
           30523 341741 151760
             532   4729   1146
             766   7057   2736
           16265  44751  12759
            1423   8656   1180
            2029   9495   1623
           10235 145190 113993
            4750  39006   7062
             130   1023    283
            5863  38636   8164
            3107   9172   2069
              13    241     70
            8413  61013  18508
            3218  15303   3581
            4178  27286   6234
             796   8749   2121
          end

          It also seems that the unit of analysis in your study is the county. Now, counties do not have attributes like sex, race, or education. So you cannot create dichotomous variables corresponding to discrete characteristics like that. What you can do is provide variables that describe the distributions of such variables in each county. So for example, you could create three variables that give the proportion of all county residents in each educational group.
          So I am thinking maybe my unit of analysis is incorrect. I want to study the change in county attributes (sex, race, education, income, etc.) due to the absence or presence of an FQHC, and also whether there is any change in counties that never had an FQHC across the same time period. I would measure the change using those variables. Thinking critically about this, I guess my unit of analysis, the "who" or "what" I am studying, is the FQHC. It is the thing that, at the end of the analysis, I want to be able to say has or does not have an impact on the communities it exists in. Since I do not have community (zip code) level data, my unit of observation, where I will make the observations, is the county. That being said, I think my desire is still to create control groups (counties with no FQHCs ever, and counties that had one at some point but not continuously) and a treatment group (counties that went from having none to having 1 or more), and to observe the within-county attributes (sex, race, education, etc.) but measure the change in income, provider type, and ER visits when controlling for those attributes.

          Hope that is not too confusing

          • #6
            Well, the designation of counties as treatment or control based on the acquiring of a new FQHC or not was already taken care of in an earlier thread, if I recall.

            But what is unclear now is what to use for your demographic variables. At the county level there is simply no reasonable way to turn them into categories. Demographic descriptions of aggregate populations are based on proportions in each category of sex, race, ethnicity, income, education, etc. Less commonly, the actual head counts are used rather than the proportions, but that actually gets very complicated. I think your starting point is to calculate the proportions as I suggested in #4. From there it is a bit less clear. To see if the sex distribution changes you can use the proportion of females (or the proportion of males, it doesn't matter as they are complementary) as the outcome in a regression with the treatment/control variable as the predictor. For multi-level demographic attributes it's a bit harder. But the -mvreg- command is probably suitable here. -help mvreg- Basically it lets you simultaneously regress the proportions in each level of race (or income, or education) against the treatment/control variable.
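
            A rough sketch of what such a call might look like for education, assuming the prop_nonvetedu_* variables from #4 and the wanted indicator already exist. Leaving one education category out of the outcome list is an assumption made here because the three proportions sum to 1; it is not something settled in this thread.

            ```stata
            * outcomes: two of the three education proportions
            * predictor: the treatment/control indicator, entered as a factor variable
            mvreg prop_nonvetedu_hsplus prop_nonvetedu_college = i.wanted
            ```

            Observations with missing proportions are dropped from the estimation, so with the data as posted only the one year with non-missing values per county would contribute.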

            • #7
              You are correct. The designation of counties as treatment or control based on acquiring of a new FQHC was taken care of in an earlier thread. I will give the mvreg command a try.
              Last edited by Rene Natasha; 25 Jan 2020, 10:51.

              • #8
                In thinking through what you mentioned in #4:
                It seems that you have multiple observations per county, I guess corresponding to different years since the context of this study is a longitudinal one. But strangely, the education variables you show have non-missing values only in the third year. Maybe that's because they come from the census or something like that? Is it your intention that these values should also apply to the other years in your study?
                I am not sure what I would need to do to make sure they apply to the other years in which they are showing up as missing.

                • #9
                  Please post example data like what you showed in #2, but including the year variable.

                  • #10
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input int vetedu_hs long(vetedu_hsplus vetedu_college) float(wanted ever_had_fqhc) byte year float(prop_nonvetedu_hs prop_nonvetedu_hsplus prop_nonvetedu_college)
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    254  4778 1938 0 1 12 .11274665 .7083201  .1789332
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    942 18324 5881 0 1 12 .08350015 .6893158 .22718407
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    208  1532  292 0 1 12 .24656802 .6426473 .11078468
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    159  1111  218 0 1 12  .1784785 .7182505 .10327097
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    527  3933  652 0 1 12  .1861931 .6999139 .11389297
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                     55   351   80 0 1 12  .3154362 .5961311  .0884327
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    160  1172  247 0 1 12 .17005824 .6931826 .13675916
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    814  9421 2064 0 1 12 .16368824 .6899685 .14634329
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                      .     .    . 0 1 11         .        .         .
                    304  2028  404 0 1 12 .18295115 .7101197 .10692921
                      .     .    . 0 1 13         .        .         .
                      .     .    . 0 1 14         .        .         .
                      .     .    . 0 1 15         .        .         .
                      .     .    . 0 1 16         .        .         .
                      .     .    . 0 1 17         .        .         .
                      .     .    . 0 1 18         .        .         .
                      .     .    . 1 1 10         .        .         .
                      .     .    . 1 1 11         .        .         .
                    303  1923  317 1 1 12  .1703534  .707425 .12222163
                      .     .    . 1 1 13         .        .         .
                      .     .    . 1 1 14         .        .         .
                      .     .    . 1 1 15         .        .         .
                      .     .    . 1 1 16         .        .         .
                      .     .    . 1 1 17         .        .         .
                      .     .    . 1 1 18         .        .         .
                      .     .    . 1 1 10         .        .         .
                      .     .    . 1 1 11         .        .         .
                    288  2682  562 1 1 12 .18395688 .6900833 .12595981
                      .     .    . 1 1 13         .        .         .
                      .     .    . 1 1 14         .        .         .
                      .     .    . 1 1 15         .        .         .
                      .     .    . 1 1 16         .        .         .
                      .     .    . 1 1 17         .        .         .
                      .     .    . 1 1 18         .        .         .
                      .     .    . 0 1 10         .        .         .
                    end
                    Last edited by Rene Natasha; 26 Jan 2020, 13:51.

                    • #11
                      The code below assumes (and verifies) that within each county the variables other than wanted, ever_had_fqhc, and year, have only a single non-missing value, which is to be carried over to all years for that county. (If there are other variables that are not to be carried over, you should add them to the varlist of the -ds- command.) Also, this example data did not contain the n_county variable, so I created one (with fictitious values) in this code. But you should not do that part, and you should instead start where it says // BEGIN HERE, and use your real n_county variable.

                      Code:
                      //  THE EXAMPLE DATA DOESN'T INCLUDE THE COUNTY IDENTIFIER
                      //  SO I'M CREATING ONE ARTIFICIALLY HERE.  WITH YOUR REAL DATA
                      //  SKIP DOWN TO WHERE IT SAYS BEGIN HERE
                      gen n_county = sum(year == 10)

                      // BEGIN HERE
                      ds wanted ever_had_fqhc year n_county, not
                      foreach v of varlist `r(varlist)' {
                          by n_county (`v'), sort: assert missing(`v') if _n > 1
                          by n_county (`v'): replace `v' = `v'[1]
                      }
                      sort n_county year
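
                      To make the mechanics easier to follow, here is a minimal sketch of the same two -by- lines on a toy dataset with two counties and a single variable. The values are copied from the example in #10; the county numbers are made up for illustration.

                      ```stata
                      clear
                      input byte(n_county year) long vetedu_hs
                      1 10   .
                      1 11   .
                      1 12 254
                      2 10   .
                      2 11   .
                      2 12 942
                      end

                      * within county, sort on the variable itself: missing sorts last,
                      * so the single non-missing value lands in the first observation
                      by n_county (vetedu_hs), sort: assert missing(vetedu_hs) if _n > 1

                      * copy that first (non-missing) value to every observation of the county
                      by n_county (vetedu_hs): replace vetedu_hs = vetedu_hs[1]

                      sort n_county year
                      list, sepby(n_county)
                      ```

                      After this runs, every row for county 1 holds 254 and every row for county 2 holds 942. The -assert- stops the code with an error if any county has more than one non-missing value.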

                      • #12
                        Thank you Clyde. I really appreciate it. Off to try this. Will keep you posted.

                        • #13
                          As a follow-up: if I understand the code, for each variable not listed in the -ds- command, the missing values would be populated with the one available value across all years. In other words, the values of vetedu_hs, vetedu_hsplus, and vetedu_college would be carried over to all years in which there are no data for each county? Right now I only have values for the year 2012 for vetedu_hs, vetedu_hsplus, and vetedu_college. The code would populate the 2012 values for all the other years in which the data are missing?

                          What if I only wanted the data to be populated for 2012-2016 or 2010-2013?

                          • #14
                            Your understanding of what the code does is correct.

                            If you wanted the data only to be populated from 2012-2016, it would be a slight modification:
                            Code:
                             
                            //  THE EXAMPLE DATA DOESN'T INCLUDE THE COUNTY IDENTIFIER
                            //  SO I'M CREATING ONE ARTIFICIALLY HERE.  WITH YOUR REAL DATA
                            //  SKIP DOWN TO WHERE IT SAYS BEGIN HERE
                            gen n_county = sum(year == 10)

                            // BEGIN HERE
                            ds wanted ever_had_fqhc year n_county, not
                            foreach v of varlist `r(varlist)' {
                                by n_county (`v'), sort: assert missing(`v') if _n > 1
                                // NOTE: IF year IS STORED AS 10-18, AS IN THE EXAMPLE DATA,
                                // USE inrange(year, 12, 16) INSTEAD
                                by n_county (`v'): replace `v' = `v'[1] if inrange(year, 2012, 2016)
                            }
                            sort n_county year

                            • #15
                              Thanks Clyde. Referring to #4, what purpose does the variable educ_total serve, since I would be using the prop_nonvetedu_* variables to understand the distribution of educational groups across counties with or without an FQHC?

                              Referring to #14: for the foreach command, I am assuming that if I wanted to handle variables with different date ranges, I would repeat the foreach command and change the inrange() dates?

                              However, for example, I have about 300 variables, of which only 15 require populating the missing data across years 2010-2016. The other variables can stay the same.
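
                              One possible way to restrict the fill to a short list of variables, sketched under assumptions rather than taken from the thread: name the variables explicitly in a local macro instead of building the list with -ds-. The macro name fill_vars and the three variables in it are placeholders, and the year range assumes year is coded 10-18 as in the example data.

                              ```stata
                              * hypothetical list: replace with the variables that need filling
                              local fill_vars vetedu_hs vetedu_hsplus vetedu_college

                              foreach v of varlist `fill_vars' {
                                  * verify each county has at most one non-missing value
                                  by n_county (`v'), sort: assert missing(`v') if _n > 1
                                  * fill only the years in the chosen range (here 2010-2016)
                                  by n_county (`v'): replace `v' = `v'[1] if inrange(year, 10, 16)
                              }
                              sort n_county year
                              ```

                              For a second group of variables with a different date range, the same loop could be repeated with another local list and different inrange() bounds. Variables left out of every list are not touched.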
