Hello everyone,
Here is an example dataset that illustrates my issue. It can be reproduced with:

Code:
clear
set obs 10
gen id = _n
set seed 123
gen tid = int(5*runiform()+1)
forvalues j=1/8 {
    generate test`j'=int(10*uniform())
}

For each id, I would like to compute the total of some of the test? columns. The first column to include in the total is given by the tid variable.
For example, for the first id, tid = 2, so the sum runs from test2 to test8. For the second id, it runs from test3 to test8, and so on.
I thought of two different ways to code this:

Code:
gen sum1 = .
forvalues j=1/`c(N)' {
    local nid = tid[`j']
    egen buff = rowtotal(test`nid'-test8)
    replace sum1 = buff in `j'
    drop buff
}

egen sum2 = rowtotal(test`=tid[_n]'-test8)

I find sum1 ugly and slow, but it works!
I wanted something faster and tried sum2. I had thought of egen and gen as loops that are evaluated for each observation, but it looks as if tid[_n] is evaluated once, for the first observation, and that value is then kept for all the remaining observations. Is this a correct interpretation of the wrong results I get?
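One way to check this interpretation is sketched below. It assumes the example dataset and sum2 from the code above are in memory, and the variable name check is only illustrative. As far as I understand, the `=exp' construct is expanded once, before the command runs, and outside a per-observation context _n is 1, so the expansion should pick up tid from the first observation only.

```stata
* Sketch: compare the inline expansion with tid in the first observation.
display "tid in observation 1: " tid[1]
display "inline expansion:     `=tid[_n]'"

* If the two values match, sum2 should equal a plain rowtotal() that
* starts at that single column for every observation.
local first = tid[1]
egen check = rowtotal(test`first'-test8)
assert sum2 == check
drop check
```

If the assert passes silently, sum2 used the same starting column in every row, which would explain the wrong results.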
Is there a more elegant (and faster!) way of handling my problem?
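One possible alternative, sketched here and not benchmarked, is to loop over the eight test variables instead of over the observations, adding each column only in the rows where it is at or after the starting column given by tid. The name sum3 is just illustrative, and the sketch assumes sum1 from the code above already exists so the two can be compared.

```stata
* Sketch: one pass per test variable rather than one pass per observation.
gen sum3 = 0
forvalues j = 1/8 {
    * Add this column only where it is at or after the start column;
    * missing values are skipped, as rowtotal() would treat them as 0.
    replace sum3 = sum3 + test`j' if `j' >= tid & !missing(test`j')
}

* sum1 is correct on this example, so the two totals should agree.
assert sum3 == sum1
```

This runs 8 replace commands whatever the number of observations, whereas the sum1 loop runs one egen per observation.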
Thank you for your time and attention!