I have an unbalanced panel dataset covering 1999 to 2019. For each individual, I would like to replace the values of the variable time_distance (in all years) with the highest value of the variable time_distanceM (which is available for 2013 and 2016 only). I don't want to apply this where the values are missing, only for individuals who actually have those observations. Thank you.
Code:
input long idpers int year byte(time_distanceM time_distance)
4101 1999 . .
4101 2000 . .
4101 2001 . .
4101 2002 . .
4101 2003 . .
4101 2004 . .
4101 2005 . .
4101 2006 . .
4101 2007 . .
4101 2008 . .
4101 2009 . .
4101 2010 . .
4101 2011 . .
4101 2012 . .
4102 1999 . .
4102 2000 . .
4102 2001 . .
4102 2002 . .
4102 2003 . .
4102 2004 . .
4102 2005 . .
4102 2006 . .
4102 2007 . .
4102 2008 . .
4102 2009 . .
4102 2010 . .
4102 2011 . .
4102 2012 . .
4103 1999 . .
4103 2000 . .
4103 2001 . .
4103 2002 . .
4103 2003 . .
4103 2004 . .
4103 2005 . .
4103 2006 . .
4103 2007 . .
4103 2008 . .
4103 2009 . .
4103 2010 . .
4103 2011 . .
4103 2012 . .
4104 1999 . .
4104 2000 . .
4104 2001 . .
4104 2002 . .
4104 2003 . .
4104 2004 . .
4104 2005 . .
4104 2006 . .
4104 2007 . .
4104 2008 . .
4104 2009 . .
4104 2010 . .
4104 2011 . .
4104 2012 . .
4105 1999 . .
4105 2000 . .
4105 2001 . .
4105 2002 . .
4105 2003 . .
4105 2004 . .
4105 2005 . .
4105 2006 . .
4105 2007 . .
4105 2008 . .
4105 2009 . .
4105 2010 . .
4105 2011 . .
4105 2012 . .
5101 1999 . 2
5101 2000 . 2
5101 2001 . 2
5101 2002 . 2
5101 2003 . 2
5101 2004 . 2
5101 2005 . 2
5101 2006 . 2
5101 2007 . 2
5101 2008 . 2
5101 2009 . 2
5101 2010 . 2
5101 2011 . 2
5101 2012 . 2
5101 2013 2 2
5101 2014 . 1
5101 2015 . 1
5101 2016 1 1
5101 2017 . .
5101 2018 . .
5101 2019 . .
5102 2005 . .
5102 2006 . .
5103 2008 . 2
5103 2009 . 2
5103 2010 . 2
5103 2011 . 2
5103 2012 . 2
5103 2013 2 2
5103 2018 . .
end
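For reference, here is a minimal sketch of the operation described above. It is one possible reading of the request, not a confirmed solution. It assumes that "highest value" means the per-person maximum of time_distanceM, and that people with no time_distanceM observation at all should be left untouched. The helper variable name max_tdM is hypothetical.

```stata
* Per-person maximum of time_distanceM; egen's max() ignores missing values
* and returns missing when a person has no non-missing value at all.
* max_tdM is a hypothetical helper name, not part of the original data.
bysort idpers: egen max_tdM = max(time_distanceM)

* Overwrite time_distance in every year, but only for persons who have
* at least one observed time_distanceM.
replace time_distance = max_tdM if !missing(max_tdM)

* Assumed variant: if years where time_distance is itself missing should
* stay missing, use this condition in the replace above instead:
*   if !missing(max_tdM) & !missing(time_distance)

drop max_tdM
```

With the example data, this would set time_distance to 2 in every year for idpers 5101 and 5103, and leave 4101 to 4105 and 5102 unchanged.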