  • Cannot run median on household income variable in census PUMS data

    Hi,

    I downloaded the 2020 PUMS household-level data and am trying to work out the median household income (variable hincp) for New York State.
    The point estimate works with this code: tabstat hincp [fweight=wgtp], statistics(median)

    But since I also want to calculate the margin of error, I set up the data as:

    svyset [pw=wgtp], sdrweight(wgtp1-wgtp80) vce(sdr)

    And after that, I can run the command to calculate the average of household income by:
    svy mean hincp

    However, when I try to calculate the median household income, it always returns:

    . svy sdr : proportion hincp
    (running proportion on estimation sample)
    hincp: factor variables may not contain negative values
    an error occurred when sdr executed proportion
    r(452);


    Can anybody help with that? Thank you so much!

  • #2
    Uhhhhhh, what are the negative values? And anyways, how can household income be negative? May I see your data using dataex? Surely something is going on that I can't see.


    • #3
      Jared Greathouse -

      Household net income can easily be negative. Nothing coming in, expenses going out, savings and other assets decreasing.

      Linying He -

      The problem is that the output of help proportion tells us
      Code:
      Only numeric, nonnegative, integer-valued variables are allowed in varlist.
      and that is because
      Code:
      proportion produces estimates of proportions, along with standard errors,
      for the categories identified by the values in each variable of varlist.
      and treating each value of income as a separate category is not a route I know of to a median and an estimate of its standard error, with or without svy. The standard errors that proportion reports are for the proportion in each individual category, not for the median.
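
      To make the point concrete, here is a small sketch on Stata's shipped auto dataset rather than the PUMS file. The variable negprice is made up purely for illustration; it stands in for an income variable with negative values.

      ```stata
      sysuse auto, clear

      * rep78 takes only a few integer values, so proportion reports one
      * estimated proportion (with its standard error) per category
      proportion rep78

      * negprice is hypothetical: a continuous-style variable with negative values.
      * proportion should refuse it with the same kind of error as in post #1;
      * capture noisily shows the message without halting a do-file
      generate negprice = -price
      capture noisily proportion negprice
      display "return code: " _rc
      ```

      Even without the negative values, a variable like income would yield one "category" per distinct value, which is not a median.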

      Unfortunately, as a non-user of the svy commands, I cannot say how to get a weighted median in these circumstances. Perhaps someone else can advise further.


      • #4
        Originally posted by Jared Greathouse:
        Uhhhhhh, what are the negative values? And anyways, how can household income be negative? May I see your data using dataex? Surely something is going on that I can't see.
        @Jared Greathouse
        Here is the download link for the data.


        • #5
          Linying He I can't see the link; the way you would do this in a reproducible example is something along the lines of

          Code:
          import delim "https://raw.githubusercontent.com/danilofreire/homicides-sp-synth/master/data/df.csv", clear
          or some equivalent, where you pull the real dataset from the internet straight into Stata. I do this all the time for real work, and I use public sources for my minimal worked examples, but not everyone on Statalist is willing to do this. So I ask again for an example using dataex so that others can comment on your data. Like William, I never use svy, so I wouldn't know why you have this problem, and I certainly couldn't diagnose it without seeing your real data from dataex.
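
          For the dataex route, something like the following is what I have in mind. It is only a sketch: the variable list (hincp, wgtp, and the first two replicate weights) and the 20-row range are picked for illustration, so adjust them to whatever reproduces the problem.

          ```stata
          * Run this with the PUMS household file in memory, then paste the
          * output into the forum between code delimiters
          dataex hincp wgtp wgtp1 wgtp2 in 1/20
          ```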

          William Lisowski I see what you mean.


          • #6
            See one approach in https://www.statalist.org/forums/for...82#post1641882
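
            Separately from the linked post, and not necessarily the same method, here is a rough sketch of computing the median and its successive-difference-replication standard error by hand with _pctile. It assumes the household file is in memory with hincp, wgtp, and wgtp1-wgtp80 as named in post #1. The scalar names (sdr_med and so on) are made up for illustration. One caveat I am aware of: ACS replicate weights can be negative for some records and _pctile does not accept negative weights, so the sketch checks for them and stops rather than silently dropping observations. Handling that case properly is not attempted here.

            ```stata
            * Stop early if any weight is negative (_pctile would reject it)
            foreach w of varlist wgtp wgtp1-wgtp80 {
                assert `w' >= 0
            }

            * Full-sample weighted median
            _pctile hincp [pweight=wgtp], percentiles(50)
            scalar sdr_med = r(r1)

            * Accumulate squared deviations of the 80 replicate medians
            * from the full-sample median
            scalar sdr_ssd = 0
            forvalues r = 1/80 {
                _pctile hincp [pweight=wgtp`r'], percentiles(50)
                scalar sdr_ssd = scalar(sdr_ssd) + (r(r1) - scalar(sdr_med))^2
            }

            * SDR variance with 80 replicates: (4/80) * sum of squared deviations
            scalar sdr_se  = sqrt((4/80) * scalar(sdr_ssd))

            * 90% margin of error, the confidence level used for published ACS estimates
            scalar sdr_moe = invnormal(0.95) * scalar(sdr_se)

            display "median = " scalar(sdr_med) "   SE = " scalar(sdr_se) "   90% MOE = " scalar(sdr_moe)
            ```

            The full-sample median from this should agree with the tabstat result in post #1; the replicate loop only adds the standard error.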
            David Radwin
            Senior Researcher, California Competes
            californiacompetes.org
            Pronouns: He/Him
