
  • Looping simple linear regression and storing beta coefficient

    Hello everyone, I have a dataset in long format. It contains the performance of 3,167 schools on a test, by year. rbd is the school ID. It looks like this:

    rbd agno promedio 
    1 2014  205.9502 
    1 2015 191.57983
    1 2016  194.0618 
    1 2017  202.0082 
    1 2018       205 
    4 2014 263.53513        
    4 2015 279.52362        
    4 2016  270.3875       
    4 2017  267.1192        
    4 2018     277.5        
    5 2014  288.3368        
    5 2015 268.91162        
    5 2016  257.3042        
    5 2017 291.87463        
    5 2018     269.5
    I need to fit a simple linear regression for each school and store the b1 coefficient. Before writing the loop, I tried the commands for one rbd:

    reg promedio agno if rbd==1
    mat beta=e(b)
    svmat double beta, names(matcol)
    gen aux=.
    sum betaagno if rbd==1
    replace aux=r(mean) if rbd==1
    After running those commands, my dataset looks like this (this is exactly what I need):

    rbd agno  promedio   betaagno  beta_cons        aux
      1 2014  205.9502   .8527954      -1519   .8527954
      1 2015 191.57983          .          .   .8527954
      1 2016  194.0618          .          .   .8527954
      1 2017  202.0082          .          .   .8527954
      1 2018       205          .          .   .8527954
      4 2014 263.53513          .          .          .
      4 2015 279.52362          .          .          .
      4 2016  270.3875          .          .          .
      4 2017  267.1192          .          .          .
      4 2018     277.5          .          .          .
      5 2014  288.3368          .          .          .
      5 2015 268.91162          .          .          .
      5 2016  257.3042          .          .          .
      5 2017 291.87463          .          .          .
      5 2018     269.5          .          .          .
    I need help with the loop. I tried this:

    forvalues i= 1/5 {
        cap reg promedio agno if rbd==`i'
        mat beta=e(b)
        svmat double beta, names(matcol)
        sum betaagno if rbd==`i'
        replace aux=r(mean) if rbd==`i'
        drop betaagno beta_cons
    }
    1) I think the commands I use to store b1 are not efficient inside a loop.
    2) I don't know how to tell the software that some ID values are missing (rbd 2 and 3 do not exist).

    Please help.

    José Antonio

  • #2
    Don't use a loop. The whole thing is a one-liner:

    rangestat (reg) promedio agno, by(rbd) interval(agno . .)
    If you are not interested in the statistics this generates other than the coefficient, you can always -drop- those variables.

    -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer; it is available from SSC.
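
    As a rough sketch of the full workflow, the lines below install -rangestat-, run the one-liner, and keep only the slope. The output names (reg_nobs, reg_r2, reg_adj_r2, b_agno, b_cons, se_agno, se_cons) are assumed from the naming pattern that -rangestat (reg)- normally uses; it is worth checking them with -describe- before dropping anything.

    ```stata
    * Install rangestat from SSC only if it is not already available
    capture which rangestat
    if _rc ssc install rangestat

    * One regression of promedio on agno per school, using all years of that school
    rangestat (reg) promedio agno, by(rbd) interval(agno . .)

    * Look at what was added (names below are assumed, so verify them here)
    describe reg_* b_* se_*

    * Keep only the slope; b_agno is repeated on every row of the same rbd
    drop reg_nobs reg_r2 reg_adj_r2 b_cons se_agno se_cons
    ```

    For rbd 1 in the example data, b_agno should match the .8527954 obtained from -regress- in the first post.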

    Added: As for your second question, you do not need to do anything special to tell Stata that some rbd values are not instantiated in your data if you use this approach. It doesn't matter at all.
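
    If a loop is still wanted for some reason, a minimal sketch could look like the one below. It is not the approach recommended above, and on 3,167 schools it is likely to be much slower. It uses -levelsof- to loop only over the rbd values that actually occur, and reads the slope directly from _b[agno] instead of going through -svmat-. The local macro name ids is arbitrary, and the code assumes that aux does not exist yet.

    ```stata
    * Collect the rbd values that actually occur, so gaps such as 2 and 3 are skipped
    quietly levelsof rbd, local(ids)

    generate double aux = .
    foreach i of local ids {
        * capture keeps the loop running if a school has too few observations
        capture quietly regress promedio agno if rbd == `i'
        if _rc == 0 {
            * _b[agno] is the slope from the regression that just ran
            quietly replace aux = _b[agno] if rbd == `i'
        }
    }
    ```

    Checking _rc before -replace- matters here: without it, a failed regression would leave the previous school's coefficient in memory and it would be copied into the wrong rows.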


    • #3
      Amazing, thank you very much. That's exactly what I need.

