  • Producing a command file from a Stata data set

    Hello,

    I want to create a command file from three variables in my data.


    I have generated the command lines, but instead of one line per county, I get one line for every observation (shown below). My data span 2009 to 2019. I have a binary policy variable that takes the value 0 or 1. For example, the policy was implemented in county 1001 in 2016, so in my panel data the policy variable for county 1001 is 0 before 2016 and 1 from 2016 onward.

    So, I need the commands for counties 1001 and 1125 to look like this:

    replace p_year = 2016 if county == 1001
    replace p_year = 2015 if county == 1125

    But instead, my command file looks like the following:

    Code:
    
    replace p_year = . if county == 1001
    replace p_year = . if county == 1001
    replace p_year = . if county == 1001
    replace p_year = . if county == 1001
    replace p_year = . if county == 1001
    replace p_year = . if county == 1001
    replace p_year = . if county == 1001
    replace p_year = 2016 if county == 1001
    replace p_year = 2017 if county == 1001
    replace p_year = 2018 if county == 1001
    replace p_year = 2019 if county == 1001
    replace p_year = . if county == 1125
    replace p_year = . if county == 1125
    replace p_year = . if county == 1125
    replace p_year = . if county == 1125
    replace p_year = . if county == 1125
    replace p_year = 2014 if county == 1125
    replace p_year = 2015 if county == 1125
    replace p_year = 2016 if county == 1125
    replace p_year = 2017 if county == 1125
    replace p_year = 2018 if county == 1125
    replace p_year = 2019 if county == 1125
    My data look like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float year double county float policy
    2014 1049 0
    2014 1051 0
    2014 1053 0
    2014 1055 0
    2014 1057 0
    2014 1059 0
    2012 1001 0
    2013 1001 0
    2014 1001 0
    2014 1061 0
    2014 1063 1
    2014 1065 1
    2014 1067 0
    2014 1069 0
    2014 1071 0
    2012 1125 0
    2013 1125 0
    2014 1125 1
    2014 4001 0
    2014 4012 0
    2014 4013 1
    2014 4015 0
    2015 1001 1
    2016 1001 1
    2017 1001 1
    2015 1125 1
    2016 1125 1
    2017 1125 1
    2018 1125 1
    2019 1125 1
    2017 1001 1
    2018 1001 1
    2019 1001 1
    end
    The following is the code I used to produce the command file from the data sample above:

    Code:
    * Sort the dataset by county and year
    sort county year
    
    * Create a new variable to identify the first occurrence of p being 1 for each county
    gen first_p1 = .
    
    * Loop over each county
    levelsof county, local(counties)
    foreach county of local counties {
        * Find the first occurrence of p being 1 for the current county
        quietly replace first_p1 = year if county == `county' & policy == 1 & first_p1 == .
    }
    
    * Generate the command to replace the p_year variable
    gen command = "replace p_year = " + string(first_p1) + " if county == " + string(county)
    
    * Save the command as a command file
    outfile command using p_did.do, noquote replace
    Last edited by Tariq Abdullah; 22 Apr 2024, 13:17.

  • #2
    I think this approach is much more complicated than it needs to be. If I understand correctly, you want to create a variable p_year whose value, for any given county, is the earliest year in which that county had the policy in effect. I don't see any need to create a command file to do this. If that's right, all you need is a one-line command:
    Code:
    by county (year), sort: egen p_year = min(cond(policy, year, .))
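    To illustrate what that line does, here is a minimal sketch on a small made-up panel (the values below are illustrative, not taken from the original data). Within each county, cond(policy, year, .) yields the year when the policy is in effect and missing otherwise, and min() then picks the earliest such year. One caveat, as I understand cond(): with three arguments, a missing policy value is treated as true, so if policy can be missing, cond(policy == 1, year, .) may be the safer form.

    ```stata
    * Toy panel: two counties, made-up years (illustrative only)
    clear
    input float year double county float policy
    2013 1001 0
    2014 1001 0
    2015 1001 1
    2016 1001 1
    2013 1125 0
    2014 1125 1
    2015 1125 1
    end

    * Earliest year in which policy == 1, repeated on every row of the county
    by county (year), sort: egen p_year = min(cond(policy, year, .))

    * Check the result: each county carries one constant p_year
    assert p_year == 2015 if county == 1001
    assert p_year == 2014 if county == 1125
    list, sepby(county)
    ```

    A county that never has the policy in effect ends up with p_year missing, because every value fed to min() is missing for that county.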
    Now, perhaps you need to create this p_year variable, consistently, in several different files, some of which might not contain all the necessary ingredients. In that case, after doing the above, you can do this:

    Code:
    keep county p_year
    duplicates drop
    isid county, sort
    save policy_years, replace
    Then, to bring the p_year variable into some other file:
    Code:
    use some_other_file, clear
    merge m:1 county using policy_years, keep(master match)
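
    If a command file really is wanted, as in the original question, a rough sketch along the following lines might work once the data have been reduced to one row per county. The small data set typed in below is a hypothetical stand-in for such a file; the file name p_did.do comes from the question, and the generated commands assume p_year already exists in whatever data set the do-file is later run on.

    ```stata
    * Hypothetical one-row-per-county data (stand-in for a file like policy_years)
    clear
    input double county float p_year
    1001 2016
    1125 2015
    4001 .
    end

    * Counties that never adopt the policy need no replace command
    drop if missing(p_year)

    * One command string per county, since there is now one row per county
    gen command = "replace p_year = " + string(p_year) + " if county == " + string(county)

    * Write the strings to a do-file and display it
    outfile command using p_did.do, noquote replace
    type p_did.do
    ```

    The merge approach above is likely simpler to maintain, since it avoids generating and running code as an intermediate step.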



    • #3
      Thank you so much! Again, you saved me a ton of trouble getting the correct code for something I had been trying to do but could not get to work properly. Much appreciated!
