Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Variable r2 values from Excel / Sigmaplot graph and Stata regression

    Hello,

    I am running a linear regression on some data I have, with the model passing through the origin - ie. no constant.
    I have plotted the data as scatterplots in Excel and added a trend line with a 0 intercept (a justifiable approach in the ecological research I am doing).
    I have also run regression on the data in Stata and Excel using the noconst function to run the regression with a 0 intercept.

    The r2 values that Stata and Excel give for the linear regression analysis are always much higher than those that appear when I "show r2 value" on the Excel graph. Considering the data, it is the lower graph r2 values that look more appropriate.

    Would anyone be able to help with a reason for why these values are different, suggest which ones are the correct value or have another solution that will allow me to run the regression and gain correct r2 and significance values?

    Usually I would avoid Excel entirely for such analysis other than having a very quick look at the data, but, this time I feel it may have flagged up an important issue.

    Any help would be much appreciated,

    Many thanks

  • #2
    There is no example data or output here to discuss and in particular I don't think you'll find many expert Excel users on Statalist.

    See e.g. https://stats.idre.ucla.edu/other/mu...ut-a-constant/ for general discussion and Stata illustration.

    For regression without an intercept, my own view is that a better measure is the square of correlation between observed and predicted, which is easy enough to calculate after a regression but not what regress returns as R-square.

    Comment


    • #3
      Hi Nick,

      Thanks for your reply. The link you post certainly helps a little toward understanding why.
      I have attached the Excel file with my data on and pasted below the stata regression no constant output. You will see that the r2 value is much higher.


      regress contour linear, noconst

      Source | SS df MS Number of obs = 32
      -------------+---------------------------------- F(1, 31) = 1504.79
      Model | 10520.273 1 10520.273 Prob > F = 0.0000
      Residual | 216.727016 31 6.99119407 R-squared = 0.9798
      -------------+---------------------------------- Adj R-squared = 0.9792
      Total | 10737 32 335.53125 Root MSE = 2.6441

      ------------------------------------------------------------------------------
      contour | Coef. Std. Err. t P>|t| [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      linear | 1.572772 .0405441 38.79 0.000 1.490082 1.655462
      ------------------------------------------------------------------------------


      I now understand why this may be, but, would like to be sure that the r2 and significance values I report are as accurate as possible.

      Would this be the type of command to use for the square or correlation method?

      predict weightp if e(sample) . corr weight weightp if e(sample) . di r(rho)^2 (from your do-it yourself FAQ)

      Finally, if you favor a different method for the r2 value, would you recommend another method for obtaining the p, t and F values?

      Many thanks
      Attached Files

      Comment

      Working...
      X