  • Piecewise regression

    Dear Statalist, how do I know what initial value to enter when doing a piecewise regression? After a lot of testing I found that the c value that minimises the MSE is 28. However, this required a lot of effort. When I enter a low value, for example 5, I get a relatively high MS value; the MS value jumps up and down until I reach a c value of 28, which gives the optimal result. When I enter a c value of more than 28, the MS increases again. Any suggestions on how to find the value that minimises the MS more efficiently?

    My command is: nl (KOSTBHG = SIZE*{b1} + (SIZE>{c})*( SIZE-{c})*{b2}), variables(SIZE) initial(b1 0 c 28 b2 0) noconstant

  • #2
    You could try a grid search:

    Code:
    clear all
    
    sysuse nlsw88, clear
    
    // compute the model for a fixed knot
    program define totry, rclass
        syntax, c(real)
        
        tempvar ttl_exp2
        gen double `ttl_exp2' = (ttl_exp > `c')*(ttl_exp - `c')
        reg wage ttl_exp `ttl_exp2'
        return scalar c  = `c'
        return scalar b0 = _b[_cons]
        return scalar b1 = _b[ttl_exp]
        return scalar b2 = _b[`ttl_exp2']
    end
    
    // try a range of knots, and choose the knot with the lowest rmse
    sum ttl_exp, d
    
    local range = r(p95) - r(p5)
    local begin = r(p5)
    local finish = r(p95)
    local d = `range'/20
    
    tempname minrmse
    scalar `minrmse' = .
    
    forvalues i = `begin'(`d')`finish' {
        qui totry, c(`i')
        if e(rmse) < `minrmse' {
            scalar `minrmse' = e(rmse)
            local c  = r(c)
            local b0 = r(b0)
            local b1 = r(b1)
            local b2 = r(b2)
    
        }
    }
    
    // display those initial values
    di `c'
    totry, c(`c')
    
    // estimate the end result
    nl (wage = {b0} + ttl_exp*{b1} + (ttl_exp>{c})*( ttl_exp-{c})*{b2}), ///
       variables(ttl_exp) initial(b0 `b0' b1 `b1' c `c' b2 `b2')
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Thank you for the quick reply! I am new to Stata and I was wondering if there is an easier way to do it. Would it be possible to see the knot from a scatter plot?

      I am also wondering how Stata decides the number of iterations. What does an iteration actually do? And why does the number of iterations change with a different "c" value?


      • #4
        Looking at a graph is a perfectly fine way to find a first guess for the value c if the graph is clear enough. In order to find an "easier" way to do things, we need to find out what you find "hard". You may find this: http://www.maartenbuis.nl/workshops/.../stata_l2.html helpful when reconstructing what my code does.
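
        As a minimal sketch of the graphical approach, assuming the same nlsw88 example data used in #2 rather than your own variables: overlay a lowess smoother on the scatter plot and look for the point where the smoothed line bends. The marker and line options are cosmetic choices only.

        ```stata
        sysuse nlsw88, clear

        // scatter of the raw data with a lowess smoother on top;
        // a visible bend in the smoother suggests a first guess for the knot c
        twoway (scatter wage ttl_exp, msize(tiny))        ///
               (lowess wage ttl_exp, lwidth(thick)),      ///
               ytitle("hourly wage") xtitle("total work experience")
        ```

        If the bend is not obvious in the graph, the grid search in #2 is probably the safer route to a starting value.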

        I suspect that by iteration you mean the iterations of nl. All the technical details of what nl does are discussed in the manual, in the section Methods and formulas. Type help nl in Stata to open the help file; at the top of the help file you will see a link to "(View complete PDF manual entry)". Click on that and you are in the manual, and then you can scroll to the section Methods and formulas.
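
        To watch what the iterations are doing, one option is to add the trace option to nl, which displays the current parameter values in the iteration log. This is only a sketch, assuming the nlsw88 example data from #2 and starting values that I picked arbitrarily for illustration. At each iteration nl adjusts the parameters to reduce the residual sum of squares and stops once further changes are small enough, so a starting value of c that is further from the solution can need more iterations.

        ```stata
        sysuse nlsw88, clear

        // same piecewise model as in #2; the starting values are arbitrary
        // guesses for illustration, not recommended values
        nl (wage = {b0} + ttl_exp*{b1} + (ttl_exp>{c})*(ttl_exp-{c})*{b2}), ///
           variables(ttl_exp) initial(b0 3 b1 0.3 c 10 b2 -0.2) trace
        ```

        Rerunning this with a different starting value for c and comparing the two logs shows how the path to the solution changes.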
        Last edited by Maarten Buis; 08 Aug 2019, 05:42.


        • #5
          Dear Maarten Buis, I plan to include the initial value of the dependent variable, the Gini coefficient, as a regressor to control for initial differences in inequality among the countries (in a panel data analysis).
          I ran the following Stata code for unconditional quantile regression. However, I could not work out how to include the initial value of the dependent variable in the regression. Could you suggest how to include it?
          sqreg gini social_protection_Spending lnlnpercapit trade d_gdp inf_cpi unemp government_effctivenss , quantile(.1 .2 .3 .4 .5 .6 .7 .8 .9) reps(800)
          Best
