  • Intuition behind binscatter controls

    Hi!

    I am using binscatter to plot two variables that take only zero or positive values.

    Without the controls option, all the binscatter dots lie in positive territory on both the X and Y axes (see here), as expected. Once I add controls (including fixed effects), however, some of the dots take negative values on the X axis (see here).

    What is the intuition behind this result? The helpfile says that the "controls" option residualizes X and Y variables on controls before plotting, but I struggle to get an intuitive explanation.

    For context: In my analysis, my units are municipalities. The X variable is immigration change, in percentage points of population, over a certain period. The Y variable is a prediction of immigration change (a separate variable), calculated using a shift-share instrument. Controls include changes in control variables, as well as regional fixed effects.

  • #2
    The helpfile says that the "controls" option residualizes X and Y variables on controls before plotting, but I struggle to get an intuitive explanation.
    Perhaps what the helpfile means is that the "controls" option regresses X and Y separately on the control variables, calculates the residuals from each of the regressions, and then plots the residuals from the Y regression against the residuals from the X regression.

    Note that binscatter is a community contributed command available from SSC, as the Statalist FAQ requests users to say when a command is not built into Stata.
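    The residualization described above can be sketched in a few lines of Python. This is only an illustration, not binscatter's own code: the data and the single control are made up, and `ols_residuals` is a hypothetical helper. It shows that residuals from a regression with a constant sum to zero, so some of them must be negative even when the raw variable is never below zero.

```python
# Sketch of "residualizing on controls" with one hypothetical control variable.
# All numbers are made up for illustration.

def ols_residuals(v, c):
    """Residuals from regressing v on a constant and a single control c."""
    n = len(v)
    mean_v = sum(v) / n
    mean_c = sum(c) / n
    slope = (sum((ci - mean_c) * (vi - mean_v) for ci, vi in zip(c, v))
             / sum((ci - mean_c) ** 2 for ci in c))
    intercept = mean_v - slope * mean_c
    return [vi - (intercept + slope * ci) for vi, ci in zip(v, c)]

control = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [0.0, 1.0, 1.5, 4.0, 3.5]   # raw X: all nonnegative
y = [0.5, 0.0, 2.0, 2.5, 5.0]   # raw Y: all nonnegative

# Each variable is regressed on the control separately; the residuals are what get plotted.
x_res = ols_residuals(x, control)
y_res = ols_residuals(y, control)

print([round(r, 3) for r in x_res])
print([round(r, 3) for r in y_res])
```

    The printed residuals are the parts of X and Y that the control does not explain, which is why they are centered on zero rather than on the raw data's range.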


    • #3
      Originally posted by William Lisowski

      Perhaps what the helpfile means is that the "controls" option regresses X and Y separately on the control variables, calculates the residuals from each of the regressions, and then plots the residuals from the Y regression against the residuals from the X regression.

      Note that binscatter is a community contributed command available from SSC, as the Statalist FAQ requests users to say when a command is not built into Stata.
      Thanks for the note, William.

      How do you then interpret the chart? For context, I am trying to use this binscatter to illustrate the positive First Stage relationship between my instrument (m hat) and my endogenous variable (m) - after throwing in controls and fixed effects, which are used in my preferred model. I am just a bit confused with what I can say about it, given the negative values on the X axis.
      Last edited by Xavier Pedros; 01 Sep 2022, 12:43.


      • #4
        Originally posted by Xavier Pedros

        Thanks for the note, William.

        How do you then interpret the chart? For context, I am trying to use this binscatter to illustrate the positive First Stage relationship between my instrument (m hat) and my endogenous variable (m) - after throwing in controls and fixed effects, which are used in my preferred model. I am just a bit confused with what I can say about it, given the negative values on the X axis.
        Sorry, just to add to my previous comment. The binscatter regression with controls and fixed effects plots a regression line of the residualized X and Y values, with the coefficient being 0.468***.

        I use Y as an instrument for X, in an IV model, where:

        W = a + bX + Controls + error1
        X = c + dY + Controls + error2

        The binscatter coefficient of 0.458*** is equal to the "d" coefficient in the second equation.

        I do not understand the intuition behind why the binscatter coefficient (which runs two regressions for X and Y, and takes the residuals) is equal to the "d" coefficient.
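        The equality asked about here is what the Frisch-Waugh-Lovell theorem predicts: the coefficient on a regressor in a multiple regression equals the slope from regressing the residualized outcome on the residualized regressor, where both are residualized on the same controls. A minimal numerical check in Python follows, under stated assumptions: the data are simulated, there is a single hypothetical control, and `solve`, `ols` and `residuals` are illustrative helpers rather than anything binscatter uses. The names `m` and `m_hat` follow the endogenous variable and instrument mentioned in post #3.

```python
import random

def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m_aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m_aug[r][col]))
        m_aug[col], m_aug[pivot] = m_aug[pivot], m_aug[col]
        for r in range(col + 1, n):
            f = m_aug[r][col] / m_aug[col][col]
            for k in range(col, n + 1):
                m_aug[r][k] -= f * m_aug[col][k]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m_aug[i][k] * beta[k] for k in range(i + 1, n))
        beta[i] = (m_aug[i][n] - tail) / m_aug[i][i]
    return beta

def ols(dep, regressors):
    """OLS coefficients of dep on a constant plus the given regressor columns."""
    cols = [[1.0] * len(dep)] + regressors
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * d for p, d in zip(ci, dep)) for ci in cols]
    return solve(xtx, xty)

def residuals(dep, regressors):
    """Residuals of dep after regressing it on a constant plus the regressor columns."""
    beta = ols(dep, regressors)
    cols = [[1.0] * len(dep)] + regressors
    return [d - sum(b * col[i] for b, col in zip(beta, cols)) for i, d in enumerate(dep)]

# Simulated data (hypothetical): one control, an instrument m_hat, an endogenous variable m.
random.seed(0)
n = 200
control = [random.gauss(0, 1) for _ in range(n)]
m_hat = [0.5 * c + random.gauss(0, 1) for c in control]
m = [0.4 * z + 0.8 * c + random.gauss(0, 1) for z, c in zip(m_hat, control)]

# First stage with the control included: coefficient on m_hat is at index 1.
d_full = ols(m, [m_hat, control])[1]

# Residualize both variables on the control, then regress residual on residual.
m_res = residuals(m, [control])
m_hat_res = residuals(m_hat, [control])
d_resid = ols(m_res, [m_hat_res])[1]

print(d_full, d_resid)
```

        The two printed numbers agree up to floating-point error. On this reading, the line that binscatter fits through the residualized points is a picture of the first-stage coefficient with the controls partialled out.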


        • #5
          Statalist supports and encourages posting png images.

          Here is your binscatter of the raw data.

          [Image: UChG5I7.png]

          Here is your binscatter of the data with the controls option.

          [Image: Dsl2DMc.png]

          From post #1

          The helpfile says that the "controls" option residualizes X and Y variables on controls before plotting, but I struggle to get an intuitive explanation.
          I installed the community contributed binscatter from SSC and see that in the detailed description further down in the helpfile, it says

          controls(varlist) residualizes the x-variable and y-variables on the specified controls before binning and plotting. To do so, binscatter runs a regression of each variable on the controls, generates the residuals, and adds the sample mean of each variable back to its residuals.
          I can't really contribute any more than that, but perhaps others will recognize what it describes itself as doing and can provide some intuition.
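          The helpfile's description (residuals plus the sample mean) can be traced by hand on a toy example. The Python sketch below uses made-up numbers and assumes the only control is a set of regional fixed effects, in which case the residual is simply the deviation from the region mean. It is an illustration of the described steps, not binscatter's code.

```python
# Hypothetical data: a nonnegative variable for four municipalities in two regions.
region = ["A", "A", "B", "B"]
x = [0.0, 10.0, 0.0, 0.0]

# Regressing x on region dummies alone gives fitted values equal to the region means,
# so each residual is the deviation from the municipality's own region mean.
region_mean = {}
for r in set(region):
    vals = [xi for xi, ri in zip(x, region) if ri == r]
    region_mean[r] = sum(vals) / len(vals)

residual = [xi - region_mean[ri] for xi, ri in zip(x, region)]

# Add the overall sample mean back to the residuals, as the helpfile describes.
overall_mean = sum(x) / len(x)
plotted = [res + overall_mean for res in residual]

print(plotted)  # [-2.5, 7.5, 2.5, 2.5]
```

          The first municipality has a raw value of 0 but a plotted value of -2.5: it sits 5 below its region mean, and adding back the overall mean of 2.5 does not bring it above zero. Adding the mean back recenters the cloud on the sample mean but does not keep every point within the range of the raw data.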

          Please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.


          • #6
            Thanks William. Noted. Just a brief note that the first chart is with controls option, while the second one is the raw data.
