[pw = weightvar] versus svysetting the data

Richard Williams

Join Date: Apr 2014

Posts: 4992
#1

24 Apr 2014, 10:19

Suppose a data set has pweights but doesn't use stratification or clustering or whatever. Rob Mowry describes such a situation in this thread:

http://www.statalist.org/forums/foru...-data-in-stata

You can analyze such data either by adding [pw=weightvar] to each estimation command, or by svysetting the data. For example,

Code:

webuse nhanes2f, clear
svyset, clear
svyset [pw = finalwgt]
logit diabetes i.female [pw=finalwgt]
svy: logit diabetes i.female

In this case, the two approaches give the same results. Other than personal preference, is there a reason for preferring one over the other? The pw= approach does report things like pseudo R^2, although I wonder whether that number is legitimate, given that it is based on a pseudo-likelihood rather than a likelihood. My own bias is to use the svy: prefix, but I am not sure if there is a good statistical reason for that.
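One way to check the comparison is to store both fits and tabulate them side by side. This is only a sketch: the stored-estimate names m_pw and m_svy are arbitrary labels chosen for illustration, and it assumes the same nhanes2f example data as above.

```stata
* Sketch: fit the same model both ways and compare coefficients and SEs
webuse nhanes2f, clear
svyset [pw = finalwgt]

* Approach 1: weight specified directly on the estimation command
logit diabetes i.female [pw = finalwgt]
estimates store m_pw                 // m_pw is an arbitrary name
display "pseudo R2 from pw= fit: " e(r2_p)

* Approach 2: svy prefix, using the svyset weights
svy: logit diabetes i.female
estimates store m_svy                // m_svy is an arbitrary name

* Side-by-side coefficients and standard errors
estimates table m_pw m_svy, b(%9.4f) se(%9.4f)
```

If the two columns agree, any remaining difference between the approaches lies in what gets reported rather than in the estimates.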

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Joseph Luchman

Join Date: Mar 2014

Posts: 114
#2

25 Apr 2014, 09:00

Hi Richard,

This is an interesting question. Because the weight acts as an observation-level multiplier in the likelihood computations, the two approaches shouldn't differ in the estimates they produce. My guess is that the svy prefix changes the output because what people want from svy usually differs: it tends to focus on the implications of the design for sampling variance (e.g., design effects, coefficients of variation, population size, etc.).

It seems to me that, because pseudo-likelihoods produce unbiased estimates of the parameters, model fit scalars such as McFadden's R^2 based on pseudo-likelihoods are still as useful as McFadden's R^2 based on likelihoods for describing the model's relationship with the data, as there's no inference involved per se. But I'd be interested in hearing from more seasoned survey experts or StataCorp on this, as I don't have any research or analysis to point to that (dis)confirms my position.
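To make the quantities concrete, here is how I understand the weighted log pseudo-likelihood for a logit model and the McFadden measure built from it. The notation is my own assumption for illustration: w_i is the pweight, y_i the 0/1 outcome, x_i the covariate vector, and \hat\beta_0 the fit of the intercept-only model.

```latex
% Weighted log pseudo-likelihood: each observation's contribution is scaled by w_i
\ell_w(\beta) = \sum_{i=1}^{n} w_i \left[ y_i \log p_i(\beta)
              + (1 - y_i) \log\bigl(1 - p_i(\beta)\bigr) \right],
\qquad p_i(\beta) = \frac{1}{1 + e^{-x_i'\beta}}

% McFadden's pseudo R^2, computed from the fitted and intercept-only models
R^2_{\mathrm{McF}} = 1 - \frac{\ell_w(\hat\beta)}{\ell_w(\hat\beta_0)}
```

With all w_i = 1 this reduces to the usual log-likelihood and the usual McFadden R^2, which is why the descriptive reading carries over.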

- joe

Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
----
Research Fellow
Fors Marsh
----
Version 18.0 MP