  • Beta coefficient in fixed effect model

    Hi there,

    This might be a very easy question, but I would like to report the beta (β, standardised coefficient) results for a fixed-effects model in a panel data analysis (longitudinal dataset, Time 1 to Time 4). I know that for ordinary linear regression we use the "regress y x, beta" syntax, but when I tried the same option here it did not work, and I am not sure whether I should use the same approach in a fixed-effects panel model with robust standard errors. I have read published papers reporting standardised coefficients (β) using Stata.

    Can someone help me with the syntax, please?

    Thanks in advance,
    Liyu
    I use Stata 15.1

  • #2
    As I wrote in an earlier post on this same topic today, using standardized coefficients in a fixed-effects regression takes a bad idea and makes it much worse.

    I won't trouble you with my usual rant on why standardized coefficients in ordinary regression are rarely useful and almost always serve to obfuscate the results. Instead, I'll focus here on the additional problems that arise in the context of panel data.

    What does "standardized" mean in the context of panel data? In a fixed effects regression, we are estimating exclusively within-panel effects. So using the standard deviation of a variable in the entire sample makes no sense--it would be an irrelevant "standard" at best, and in some situations would be dominated by the between-panel variation that is explicitly excluded from consideration in fixed-effects models. But that still leaves ambiguity. Do you want to standardize within each panel separately, or do you want to calculate a pooled standard deviation across the panels? How will you explain or justify this choice to your audience? What does it even mean? With either approach, nobody but you will have any clue what the standard deviation(s) used actually is(are), so it follows that nobody will have any idea what your regression coefficients mean.
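To make the ambiguity concrete, here is a minimal sketch in plain Python (not Stata) that computes the three candidate "standards" for one variable. The panel values are made up for illustration, and the pooled within-panel SD uses N minus the number of panels as its degrees of freedom, which is one convention among several.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy panel: 3 units observed at 4 time points (values made up).
panel = {
    "A": [10.0, 11.0, 12.0, 13.0],
    "B": [50.0, 51.0, 49.0, 50.0],
    "C": [90.0, 92.0, 94.0, 96.0],
}

# (1) Overall SD: pools every observation, so between-panel gaps dominate.
all_values = [v for series in panel.values() for v in series]
overall_sd = stdev(all_values)

# (2) A separate SD within each panel.
per_panel_sd = {unit: stdev(series) for unit, series in panel.items()}

# (3) Pooled within-panel SD: deviations from each panel's own mean,
#     with N - (number of panels) degrees of freedom (an assumed convention).
within_ss = sum((v - mean(s)) ** 2 for s in panel.values() for v in s)
pooled_within_sd = sqrt(within_ss / (len(all_values) - len(panel)))

print(f"overall SD:          {overall_sd:.3f}")
print(f"pooled within SD:    {pooled_within_sd:.3f}")
for unit, sd in per_panel_sd.items():
    print(f"SD within panel {unit}:   {sd:.3f}")
```

In this toy data the overall SD is many times larger than any of the within-panel figures, so a coefficient "per 1 SD" changes by the same factor depending on which SD is silently chosen.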

    If you are thinking that you need standardization so that you can "compare" the effects of different variables in your model, that notion is discredited and illusory even in the context of non-hierarchical data (ordinary beta coefficients), and it is even farther from reality with panel data.

    There is no built-in command or option in Stata to do this. If you really have a compelling reason to go down this rabbit hole, you will have to simply standardize the variables (in whichever way you think is less nonsensical) yourself using the usual data management commands and then run the panel regression with them.
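For anyone who wants to see the arithmetic of the do-it-yourself route, the following is a rough sketch in plain Python rather than Stata syntax. It assumes a single regressor, made-up data, and the pooled within-panel SD as the (arbitrary) choice of standard; the helper names `demean`, `within_sd`, and `slope` are hypothetical. It demeans each variable within panel, rescales by the pooled within SD, and computes the fixed-effects (within) slope before and after rescaling.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy panel (made-up numbers): unit -> list of (x, y) over 4 waves.
data = {
    "A": [(1.0, 2.0), (2.0, 4.5), (3.0, 6.0), (4.0, 8.5)],
    "B": [(10.0, 30.0), (11.0, 31.5), (12.0, 34.5), (13.0, 36.0)],
}

def demean(data, idx):
    """Subtract each unit's own mean from its values (the 'within' transform)."""
    out = []
    for rows in data.values():
        m = mean(r[idx] for r in rows)
        out.extend(r[idx] - m for r in rows)
    return out

def within_sd(dev, n_units):
    """Pooled within-unit SD, using N - n_units degrees of freedom (assumed)."""
    return sqrt(sum(d * d for d in dev) / (len(dev) - n_units))

def slope(xd, yd):
    """Least-squares slope for one regressor on already-demeaned data."""
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

xd, yd = demean(data, 0), demean(data, 1)
sx, sy = within_sd(xd, len(data)), within_sd(yd, len(data))

b_raw = slope(xd, yd)                                      # y units per x unit
b_std = slope([d / sx for d in xd], [d / sy for d in yd])  # within SDs per within SD

print(f"raw within slope:          {b_raw:.3f}")
print(f"standardized within slope: {b_std:.3f}")
```

The standardized slope is just the raw slope multiplied by sx / sy, so nothing new is estimated. The result only changes units, and it does so using SDs that readers do not have unless they are reported separately.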
    Last edited by Clyde Schechter; 15 Nov 2018, 01:23.


    • #3
      I realised that I didn't reply. Sorry, and thank you, Clyde. That was useful.

      Liyu


      • #4
        Hi Clyde,

        could you go into that and explain why this is the case: " ... standardized coefficients in ordinary regression are rarely useful and almost always serve to obfuscate the results."

        Many thanks!
        Elinor


        • #5
          Let's take a simple example. Suppose I have a data set of observations of people in some situation, giving the period of time each person was observed and the amount of water they consumed over that time. I could do various things with that data, including regressing water consumption against time and reaching conclusions like "typically water is consumed at a rate of 100 g (3.5 oz) per hour." Almost any literate person will understand this result. Now suppose that I use standardized variables for the regression. My result might be something like "each 1 sd increment in time is associated with a 0.7 sd increment in water consumption." What on earth does that mean? The only person who can possibly interpret that is me, because I alone know what the standard deviations of the water and time variables in the data are. Even if I am kind enough to share the standard deviations with my audience/readers, to make sense of this result they now have to do mental calculations with multi-digit numbers just to get a sense of the magnitude of this association.
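The mental calculation readers are left with looks roughly like this. This is a small Python sketch; the two standard deviations are hypothetical values chosen only so that the result echoes the 100 g per hour figure above, and `raw_slope` is an illustrative helper name.

```python
def raw_slope(beta_std, sd_x, sd_y):
    """Convert a standardized slope back to raw units (y units per x unit)."""
    return beta_std * sd_y / sd_x

# Hypothetical SDs (not from any real data set).
sd_time_hours = 2.1
sd_water_grams = 300.0

rate = raw_slope(0.7, sd_time_hours, sd_water_grams)
print(f"{rate:.1f} g of water per hour")
```

Without both standard deviations, the 0.7 cannot be converted at all, which is the point of the example.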

          Of course, there is a role for standardization of variables. When the raw variables themselves have no canonical unit of measurement that would be widely understood, using a standardized variable enhances comprehensibility. So if I create some scale to measure a construct like job satisfaction, and regress it against, say, hourly pay, presenting a result like "each additional dollar per hour of pay was associated with an increase of 0.9 points on the job satisfaction scale" would not be helpful, because nobody (maybe not even me) knows whether 0.9 points on this scale is a big difference in job satisfaction, just a little one, or somewhere in between. But if I say that "each additional dollar per hour of pay was associated with a 2 standard deviation difference on the job satisfaction scale," then people know that this is a large difference in job satisfaction. Notice, though, that if I also standardize the hourly pay variable, I reintroduce obscurity. That's because pay is canonically measured in currency units and everybody understands those.

          So the rule is pretty simple: if a variable has canonical units of measurement, use the variable in those units (or perhaps scaled by some power of 10 if appropriate). If a variable has no canonical units of measurement, standardize it.
          Last edited by Clyde Schechter; 04 Jun 2024, 14:52.


          • #6
            Thank you, that was helpful!
