There is a lot to unpack in this thread. I will break this up into two posts. In this post I will look at what a difference in \(R^2\) could actually mean.
Let's start with a bivariate regression: \(y = \beta_0 + \beta_1 x_1 + \varepsilon\). The \(R^2\) is the proportion of the variance in \(y\) explained by the model: \(\frac{\mathrm{var}(\hat{y})}{\mathrm{var}(y)} = \frac{\mathrm{var}(\beta_0 + \beta_1 x_1)}{\mathrm{var}(y)} = \frac{\beta_1^2\mathrm{var}(x_1)}{\mathrm{var}(y)}\)
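That decomposition is easy to check numerically. Here is a quick simulation sketch (the coefficient values and variances are just illustrative choices, not anything from the discussion above) showing that \(\mathrm{var}(\hat{y})/\mathrm{var}(y)\) and \(\beta_1^2\mathrm{var}(x_1)/\mathrm{var}(y)\) agree:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# simulate y = b0 + b1*x + e with arbitrary illustrative parameters
b0, b1 = 1.0, 2.0
x = rng.normal(0, 1.5, n)   # sd(x) = 1.5, so var(x) = 2.25
e = rng.normal(0, 3.0, n)   # var(e) = 9
y = b0 + b1 * x + e

# fit OLS; polyfit returns (slope, intercept) for degree 1
b1_hat, b0_hat = np.polyfit(x, y, 1)
y_hat = b0_hat + b1_hat * x

# R^2 two ways: var(y_hat)/var(y) and beta1^2 * var(x) / var(y)
r2_a = np.var(y_hat) / np.var(y)
r2_b = b1_hat**2 * np.var(x) / np.var(y)
print(r2_a, r2_b)  # identical up to floating point; both near 0.5 here
```

With these parameters the population \(R^2\) is \(\beta_1^2\mathrm{var}(x_1) / (\beta_1^2\mathrm{var}(x_1) + \mathrm{var}(\varepsilon)) = 9/18 = 0.5\), which the simulation reproduces up to sampling noise.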
So if we compare the \(R^2\) across groups, the \(R^2\) in group 1 could be higher than in group 2 because \(\beta_1\) is larger in group 1, because the variance of \(x_1\) is larger in group 1, or because the overall variance of \(y\) is lower in group 1 (and if we assume for the latter case that \(\beta_1\) and \(\mathrm{var}(x_1)\) are equal in groups 1 and 2, then \(\mathrm{var}(\varepsilon)\) must be lower in group 1). In real life it is going to be a combination of all three. So what does a difference in \(R^2\) across groups mean? It means that either \(x_1\) has a different effect across groups, or the variance of \(x_1\) differs across groups, or the variance of the unobserved other factors differs across groups, or any combination of these. I don't find that a very satisfying result.
Things get even more complicated when you add multiple explanatory variables. Remember that \(\mathrm{var}(x + z) = \mathrm{var}(x) + \mathrm{var}(z) + 2\mathrm{cov}(x,z)\), so the explained variance now also depends on the covariances between the explanatory variables, and those covariances can differ across groups as well.
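That identity can also be verified directly (the data-generating process below is just an arbitrary example with correlated \(x\) and \(z\)):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# two correlated variables: z depends partly on x
x = rng.normal(0, 1, n)
z = 0.5 * x + rng.normal(0, 1, n)

lhs = np.var(x + z)
# bias=True makes np.cov use the same 1/N denominator as np.var,
# so the identity holds exactly (up to floating point)
rhs = np.var(x) + np.var(z) + 2 * np.cov(x, z, bias=True)[0, 1]
print(lhs, rhs)  # the two values agree
```

The covariance term is why, with several regressors, \(\mathrm{var}(\hat{y})\) is no longer a simple sum of per-variable contributions.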