  • Testing coefficients across xtreg and saving results using a loop and macro

    Hi, I am trying to run some basic tests across fixed effects models and save the p-value (Prob>F).

    Here's some basic syntax below that mirrors my (much more involved and more complicated) modeling, where I progressively test different coefficients across race and south, and do this for multiple outcomes within a foreach loop.

    Ultimately, I would like to store each resulting p-value into a data frame that identifies the variable, outcome, and category (race and south) it came from. If anyone has recommendations to do that within a loop framework like the below, I would appreciate them!



    Code:
    use "https://www.stata-press.com/data/r17/nlswork.dta", clear
    
    foreach var of varlist hours ln_wage {
    
    * Race
    xtreg `var' tenure ttl_exp i.year if race == 1, fe vce(cluster idcode)
    local coef_tenurewhite_`var' =_b[tenure]
    local coef_expwhite_`var' =_b[ttl_exp]
    
    xtreg `var' tenure ttl_exp i.year if race == 2, fe vce(cluster idcode)
    test `coef_tenurewhite_`var'' =_b[tenure]
    test `coef_expwhite_`var'' =_b[ttl_exp]
    
    * South
    xtreg `var' tenure ttl_exp i.year if south == 1, fe vce(cluster idcode)
    local coef_tenuresouth_`var' =_b[tenure]
    local coef_expsouth_`var' =_b[ttl_exp]
    
    xtreg `var' tenure ttl_exp i.year if south == 0, fe vce(cluster idcode)
    test `coef_tenuresouth_`var'' =_b[tenure]
    test `coef_expsouth_`var'' =_b[ttl_exp]
    
    }

  • #2
    Code:
    clear*
    use "https://www.stata-press.com/data/r17/nlswork.dta"
    
    frame create results str32(variable outcome category) int category_level ///
        float (coefficient omnibus_p_value)
    foreach var of varlist hours ln_wage {
        levelsof race, local(races)
        foreach r of local races {
            xtreg `var' tenure ttl_exp i.year if race == `r', fe cluster(idcode)
            frame post results ("tenure") ("`var'") ("race") (`r') (_b[tenure]) (e(p))
            frame post results ("ttl_exp") ("`var'") ("race") (`r') (_b[ttl_exp]) (e(p))
        }
        
        levelsof south, local(souths)
        foreach s of local souths {
            xtreg `var' tenure ttl_exp i.year if south == `s', fe cluster(idcode)
            frame post results ("tenure") ("`var'") ("south") (`s') (_b[tenure]) (e(p))
            frame post results ("ttl_exp") ("`var'") ("south") (`s') (_b[ttl_exp]) (e(p))
        }
    }
    
    frame change results

    It is possible to consolidate the separate code for race and south into a single loop over those variables, and the separate -frame post- commands for tenure and ttl_exp could also be made into a loop nested in that. But I think that makes the code more opaque, and more difficult to maintain or modify later. When something iterates over only two values I usually spell it out twice rather than wrap it in a loop, the notable exception being -forvalues i = 0/1- loops, which are very common and perfectly clear.
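
    For comparison, here is a rough sketch of what that consolidated version might look like. It is only an illustration: the loop macros grp, levels, l and x are names made up for this sketch, and it assumes every subgroup regression runs without error. It should post the same rows as the spelled-out code, just in a different order.

    Code:
    clear*
    use "https://www.stata-press.com/data/r17/nlswork.dta"

    frame create results str32(variable outcome category) int category_level ///
        float (coefficient omnibus_p_value)
    foreach var of varlist hours ln_wage {
        * grp takes the place of the separate race and south sections
        foreach grp of varlist race south {
            levelsof `grp', local(levels)
            foreach l of local levels {
                xtreg `var' tenure ttl_exp i.year if `grp' == `l', fe vce(cluster idcode)
                * one row per regressor of interest
                foreach x in tenure ttl_exp {
                    frame post results ("`x'") ("`var'") ("`grp'") (`l') (_b[`x']) (e(p))
                }
            }
        }
    }

    frame change results

    The trade-off is the one described above: there is less repetition, but four levels of nesting to read through before getting to the -frame post-.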

    Your code shown in #1 does not do anything to capture a p-value (Prob > F). I don't know if that's because you don't really want it, or because you weren't sure how to do it. Anyway, it's included here. Just remember two things: 1) the only F-test reported in this output is the overall model F-test, so this result is going to be the same for every row that comes from the same regression. It will not differ between ttl_exp and tenure. 2) The result of that omnibus model F-test is seldom of any interest in real-world research problems. Are you sure you wanted that, and not the p-values of the coefficients' t-tests?
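
    If the coefficient-level p-values are what is wanted, one possible approach is to pull them from r(table) right after each regression. The following is a minimal sketch for race only (south would follow the same pattern). The frame name coef_results, the variable name p_value, and the loop macro x are made up for this illustration.

    Code:
    clear*
    use "https://www.stata-press.com/data/r17/nlswork.dta"

    frame create coef_results str32(variable outcome category) int category_level ///
        float (coefficient p_value)
    foreach var of varlist hours ln_wage {
        levelsof race, local(races)
        foreach r of local races {
            xtreg `var' tenure ttl_exp i.year if race == `r', fe vce(cluster idcode)
            * r(table) holds b, se, t, pvalue, etc., one column per coefficient
            matrix M = r(table)
            foreach x in tenure ttl_exp {
                frame post coef_results ("`x'") ("`var'") ("race") (`r') ///
                    (M["b", "`x'"]) (M["pvalue", "`x'"])
            }
        }
    }

    frame coef_results: list, noobs sepby(outcome)

    Unlike e(p), these p-values differ between tenure and ttl_exp within the same regression, because each one comes from that coefficient's own t-test.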


    • #3
      Thanks Clyde Schechter, my aim was to test (in following my code example) whether, e.g., the coefficient for -tenure- among whites is significantly different from the coefficient for -tenure- among non-whites. My impression from this thread (https://www.statalist.org/forums/for...-i-year-and-fe) was that my code would do that, but perhaps that's not true.


      • #4
        my aim was to test (in following my code example) whether, e.g., the coefficient for -tenure- among whites is significantly different from the coefficient for -tenure- among non-whites
        Well, you can't do that with the type of output you are generating anyway. Having the coefficients at both values of race or south and their test statistics won't do it. You need to actually do a different regression, with an interaction between race (resp. south) and each of tenure and ttl_exp. The interaction coefficients are the differences between the coefficients when race = 1 vs race = 2 (resp. south = 0 vs 1), and their p-values test whether those differences are zero.

        Code:
        clear*
        use "https://www.stata-press.com/data/r17/nlswork.dta"
        
        frame create results str32(variable outcome category) ///
            float (difference p_value_diff)
        foreach var of varlist hours ln_wage {
            xtreg `var' i.race##c.(tenure ttl_exp) i.year if inlist(race, 1, 2), ///
                fe cluster(idcode)
            matrix M = r(table)
            frame post results ("tenure") ("`var'") ("race")  ///
                (M["b", "2.race#c.tenure"]) (M["pvalue", "2.race#c.tenure"])
            frame post results ("ttl_exp") ("`var'") ("race") ///
                (M["b", "2.race#c.ttl_exp"]) (M["pvalue", "2.race#c.ttl_exp"])
            
            xtreg `var' i.south##c.(tenure ttl_exp) i.year, fe cluster(idcode)
            matrix M = r(table)
            frame post results ("tenure") ("`var'") ("south")  ///
                (M["b", "1.south#c.tenure"]) (M["pvalue", "1.south#c.tenure"])
            frame post results ("ttl_exp") ("`var'") ("south") ///
                (M["b", "1.south#c.ttl_exp"]) (M["pvalue", "1.south#c.ttl_exp"])
        }
        
        frame change results
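
        Since #1 asked specifically about Prob > F, here is a small sketch of how -test- could be used after the interaction model to get it. This is an illustration for one outcome only, not part of the code above. For a single interaction coefficient, the Prob > F from -test- should match the t-test p-value in r(table); a joint test of both interactions is something the per-coefficient p-values do not provide.

        Code:
        clear*
        use "https://www.stata-press.com/data/r17/nlswork.dta"

        xtreg ln_wage i.race##c.(tenure ttl_exp) i.year if inlist(race, 1, 2), ///
            fe vce(cluster idcode)
        * save r(table) now, because -test- overwrites the r() results
        matrix M = r(table)

        * Wald test that the tenure coefficient is the same for race == 1 and race == 2
        test 2.race#c.tenure
        display "F = " r(F) "   Prob > F = " r(p)
        display "p-value from r(table) = " M["pvalue", "2.race#c.tenure"]

        * joint test that neither coefficient differs by race
        test 2.race#c.tenure 2.race#c.ttl_exp
        display "joint Prob > F = " r(p)

        After each -test-, r(p) could be posted to the results frame in the same way as the r(table) entries.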
