  • Twoway Line Chart for Regression Dummy Coefficients

    Dear all,

    Suppose I have a large dataset that comprises the age and income of individuals. Now assume I regress income on age but use age dummies instead of treating age as a continuous variable. Afterwards I visualize the age dummy coefficients in a line plot. The x-axis shows age (from 18 to 80 years) and the y-axis the coefficient values. I can achieve this with the following code:
    Code:
    * Create arbitrary dataset
    clear
    set obs 10000
    egen age = seq(), f(18) t(80)
    gen income = rnormal(4000,1000)
    
    *Regress and plot age dummy coefficients:
    gen beta1=.
    gen n1=_n if _n>17 & _n<=80
    xi: reg income i.age
    forvalues number=19/80 {
    replace beta1=_b[_Iage_`number'] if _n==`number'
    }
    twoway (line beta1 n1, graphregion(color(white)) ylabel(, nogrid) xtitle(Age) ytitle(Wage) lcolor(black))
    This results in the following graph, which is exactly what I want:
    [Image: Graph1.png]

    However, now comes the tricky part. I want to repeat this procedure, but instead of using age dummies for every year (18, 19, 20, ..., 79, 80) I want to use age dummies for 10-year intervals (e.g. whether someone is in their twenties, thirties, forties, ...), for which I generated a variable called "age_10":

    Code:
    *Create Age variable comprising 10 year intervals
    g age_10=.
    replace age_10=10 if age<20
    replace age_10=20 if age>=20 & age<30
    replace age_10=30 if age>=30 & age<40
    replace age_10=40 if age>=40 & age<50
    replace age_10=50 if age>=50 & age<60
    replace age_10=60 if age>=60 & age<70
    replace age_10=70 if age>=70 & age<80
    replace age_10=80 if age>=80
    Unfortunately, if I proceed using the same method as above, the resulting plot is empty, even though the x-axis and y-axis are scaled correctly; only the line is missing. I used the following code to create the second plot:

    Code:
    *Regress and plot 10-year age dummy coefficients:
    gen beta2=.
    gen n2=_n*10 if _n>1 & _n<=8
    xi: reg income i.age_10
    foreach number of numlist 20 30 40 50 60 70 80 {
    replace beta2=_b[_Iage_10_`number'] if _n==`number'
    }
    twoway (line beta2 n2, graphregion(color(white)) ylabel(, nogrid) xtitle(Age) ytitle(Wage) lcolor(black))
    This creates the following graph:
    [Image: Graph2.png]

    Any help to find my mistake or to come up with a different solution would be highly appreciated.

    Best regards,

    Neil
    Last edited by Neil Murray; 28 May 2021, 08:12.

  • #2
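    xi: is obsolete unless you have a very old Stata, in which case you are asked to tell us. That isn't the problem, though. When I run this my fake data are different, but the problem is where you put the coefficients: they are stored in observations 20, 30, ..., 80, while n2 is non-missing only in observations 2 to 8, so no observation holds both values and there is nothing to plot.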

    Code:
    . list beta2 n2 if beta2 < . | n2 < . 
    
           +---------------+
           |    beta2   n2 |
           |---------------|
        2. |        .   20 |
        3. |        .   30 |
        4. |        .   40 |
        5. |        .   50 |
        6. |        .   60 |
           |---------------|
        7. |        .   70 |
        8. |        .   80 |
       20. | 12.27108    . |
       30. | 91.22677    . |
       40. | 73.43433    . |
           |---------------|
       50. | 44.41222    . |
       60. | 151.7659    . |
       70. | 80.22237    . |
       80. | 246.6667    . |
           +---------------+
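One way to confirm this kind of misalignment is to count the observations in which both variables are non-missing, since a line can only be drawn through such observations. What follows is a minimal, self-contained sketch rather than the thread's actual data: the constant 1 is a placeholder standing in for the regression coefficients (an assumption made purely for illustration).

```stata
* Sketch: reproduce the placement pattern from #1 with placeholder values
clear
set obs 100
gen n2 = _n*10 if _n>1 & _n<=8        // x values land in observations 2 to 8
gen beta2 = .
foreach number of numlist 20(10)80 {
    * placeholder value 1 instead of _b[...]; stored in observations 20, 30, ..., 80
    replace beta2 = 1 if _n==`number'
}
* Observations usable for the line: both beta2 and n2 non-missing
count if !missing(beta2, n2)
```

Here count reports 0, which matches the empty plot.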
    As a secondary detail, note that

    Code:
    g age_10 = 10 * floor(age/ 10)
    speeds up your binning. See also https://www.stata-journal.com/articl...article=dm0095 or https://www.stata-journal.com/articl...article=dm0002
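As a quick check that the one-line binning agrees with the longer chain of replace statements, here is a small sketch on ages 18 to 80 only. The variable name age_10b and the loop over lower bounds are introduced here for illustration and are not from the original posts.

```stata
* Sketch: compare floor-based binning with a long-hand version for ages 18-80
clear
set obs 63
gen age = _n + 17                      // ages 18, 19, ..., 80
gen age_10 = 10 * floor(age/10)

* Long-hand bins (age_10b is a hypothetical comparison variable)
gen age_10b = .
replace age_10b = 10 if age<20
forvalues lo = 20(10)70 {
    replace age_10b = `lo' if age>=`lo' & age<`lo'+10
}
replace age_10b = 80 if age>=80

* Stops with an error if the two versions ever disagree
assert age_10 == age_10b
tabstat age, by(age_10) statistics(min max)
```

The agreement holds for this age range. Outside it the two versions differ: for example, the floor() version would put an age of 95 in bin 90, whereas the replace chain would put it in bin 80.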



    • #3
      Thank you very much for identifying my mistake and also for the advice on easy binning; that helps a lot!

      I used the following workaround to come up with the results I wanted:
      Code:
      gen beta2=.
      gen n2=_n*10 if _n>1 & _n<=8
      xi: reg income i.age_10
      foreach number of numlist 20 30 40 50 60 70 80 {
      replace beta2=_b[_Iage_10_`number'] if _n==`number'/10
      }

      twoway (line beta2 n2, graphregion(color(white)) ylabel(, nogrid) xtitle(Age) ytitle(Wage) lcolor(black))
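Since #2 points out that xi: is obsolete in current Stata, the same idea can be sketched with factor-variable notation, writing each coefficient and its x value into the same observation. This is only an illustration under stated assumptions: the seed, the counter macro i, and the choice to fill observations 1 to 7 are not from the thread, and the fake data will produce different coefficients from those listed above.

```stata
* Sketch: factor-variable version with coefficients and x values aligned
clear
set seed 12345                          // arbitrary seed, for reproducibility only
set obs 10000
egen age = seq(), from(18) to(80)
gen income = rnormal(4000, 1000)
gen age_10 = 10 * floor(age/10)

* i.age_10 creates the dummies on the fly; level 10 is the base category
regress income i.age_10

gen beta2 = .
gen n2 = .
local i = 1
foreach number of numlist 20(10)80 {
    * store x value and coefficient side by side in observation `i'
    replace n2 = `number' in `i'
    replace beta2 = _b[`number'.age_10] in `i'
    local ++i
}

twoway line beta2 n2, graphregion(color(white)) ylabel(, nogrid) ///
    xtitle(Age) ytitle(Coefficient) lcolor(black)
```

Filling both variables inside the same loop avoids having to keep the observation number in step with the age value by hand.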
