Suppose I have a dataset containing a thousand variables, each of which has only one nonmissing entry per id. Until now, I have used the Stata code below to fill in the missing entries, setting each one to the variable's unique nonmissing observation within its id.
Unfortunately, this procedure is quite slow, especially with (very) large datasets. Would it make sense to rewrite this snippet in Mata, and would there be a significant gain in speed?
Code:
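forvalues i = 1/1000 {
    // sort within id so the nonmissing entry lands at a known position
    sort id var_`i'
    // [_N] picks the nonmissing value for string variables ("" sorts first);
    // numeric missings sort last, so numeric variables would need [1] instead
    by id (var_`i'): gen newvar_`i' = var_`i'[_N]
}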