  • Synthetic Data (Poisson Regression)

    Hello

    I'm trying to get to grips with generating synthetic data, but I'm having some difficulties.

    Following Hilbe (2010) (http://www.stata-journal.com/article...article=st0186),
    I want to generate synthetic data to analyze with various count data models, starting with Poisson.

    So I use the code:

    clear
    set obs 50000
    set seed 4744
    generate x1 = invnormal(runiform())
    generate x2 = invnormal(runiform())
    generate xb = 2 + 0.75*x1 - 1.25*x2
    generate exb = exp(xb)
    generate py = rpoisson(exb)
    poisson py x1 x2
    * And then I would like to test for overdispersion,
    * following Cameron1990 http://www.sciencedirect.com/science...0440769090014K
    predict muhat, n
    generate ystar = ((py-muhat)^2 - py)
    regress py muhat, noconstant noheader
    * so an alpha = 1 indicates overdispersion

    So far so good. Now I would like to generate data without overdispersion, namely Poisson(1).
    As far as I understand, I need E(exb) = 1 and Var(exb) = 1?
    To get E(exb) = 1, E(xb) has to be 0, so any linear predictor with E(xb) = 0 would do, for example xb = 1 + b1*x1 - b2*x2 with b2 = 2*b1 when xi ~ N(1,1).
    Further, we need Var(exb) = E(exb), and this is where I'm stuck.
    Any advice on how I could proceed?

    Thank you





  • #2
    OK, I found it.

    Code:
    regress py muhat, noconstant noheader
    should of course be:
    Code:
    regress ystar muhat, noconstant noheader



    • #3
      Dear Oscar,

      Conditioning on x1 and x2, the data you are generating does not have overdispersion. Anyway, to generate Poisson(1) just do
      Code:
      g y=rpoisson(1)
      Best regards,

      Joao
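
      As a quick check of this suggestion, the sketch below draws Poisson(1) directly and compares the sample mean and variance. The sample size and seed are arbitrary choices made here for illustration, not values from the thread.

      ```stata
      clear
      set obs 100000            // arbitrary sample size (assumption)
      set seed 12345            // arbitrary seed (assumption)
      generate y = rpoisson(1)  // no covariates, so the mean is constant at 1
      summarize y
      * ratio of sample variance to sample mean; close to 1 under equidispersion
      display "variance/mean = " r(Var)/r(mean)
      ```

      Because y does not depend on x1 or x2 here, the conditional and the unconditional variance coincide, and both equal the mean.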



      • #4
        Originally posted by Joao Santos Silva
        Dear Oscar,

        Conditioning on x1 and x2, the data you are generating does not have overdispersion. Anyway, to generate Poisson(1) just do
        Code:
        g y=rpoisson(1)
        Best regards,

        Joao

        Thank you, Joao.
        My question is how I could formulate everything from scratch so that I get py ~ Poisson(1). I think I have found a solution:
        Code:
        clear all
        set more off                                
        set obs 1000000                                
        set seed 1234
        generate x1=rnormal(1,.1)
        generate x2=rnormal(1,.1)
        generate xb = 1 + x1 - 2*x2     // E(xb) = 0
        generate exb = exp(xb)          // E(exb) = exp(Var(xb)/2) = exp(0.025), about 1.025, not exactly 1
        generate py = rpoisson(exb)
        poisson py x1 x2
        predict muhat,n
        generate ystar = ((py-muhat)^2 - py)/muhat
        regress ystar muhat, noconstant noheader
        So far so good.

        Going back to my initial code, I find no overdispersion (using
        Code:
        generate ystar = ((py-muhat)^2 - py)
        regress ystar muhat, noconstant noheader
        ). But when I summarize py, I can see that the variance is far above the mean. How is this possible?





        • #5
          Dear Oscar,

          You are mixing up conditional and unconditional equidispersion. You should look at a good textbook covering count data models to see the difference between the two concepts.

          In all the cases you are generating you have conditional equidispersion and unconditional overdispersion. The only way to generate data with unconditional equidispersion is to generate the Poisson variable without making it a function of x1 and x2, as I suggested.

          Best regards,

          Joao
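
          The difference between the two concepts can be made concrete with the law of total variance. What follows is only a sketch, assuming py given x1 and x2 is Poisson with mean mu = exp(xb), as in the code posted in this thread; the symbols y, x, \mu, m and \sigma^2 are notation introduced here, not taken from the posts.

          ```latex
          \begin{align*}
          \operatorname{Var}(y \mid x) &= \operatorname{E}(y \mid x) = \mu, \qquad \mu = \exp(x\beta)
            && \text{(conditional equidispersion)} \\
          \operatorname{E}(y) &= \operatorname{E}(\mu) \\
          \operatorname{Var}(y) &= \operatorname{E}\!\left[\operatorname{Var}(y \mid x)\right]
            + \operatorname{Var}\!\left[\operatorname{E}(y \mid x)\right]
            = \operatorname{E}(\mu) + \operatorname{Var}(\mu) \;\ge\; \operatorname{E}(y) \\[4pt]
          \text{If } x\beta \sim N(m, \sigma^2): \quad
          \operatorname{E}(\mu) &= \exp\!\left(m + \tfrac{\sigma^2}{2}\right), \qquad
          \operatorname{Var}(\mu) = \left(e^{\sigma^2} - 1\right)\exp\!\left(2m + \sigma^2\right)
          \end{align*}
          ```

          Equality holds only when Var(\mu) = 0, that is, when the mean does not vary with x1 and x2. For the first data-generating process (xb = 2 + 0.75*x1 - 1.25*x2 with independent standard normal regressors), m = 2 and \sigma^2 = 0.75^2 + 1.25^2 = 2.125, so the population mean of py is roughly 21.4 while Var(\mu) is roughly 3370, which is why summarize shows a variance far above the mean. For the second process (xb = 1 + x1 - 2*x2 with regressors N(1, 0.1^2)), m = 0 and \sigma^2 = 0.05, giving a mean of about 1.025 and a variance of about 1.025 + 0.054 = 1.079, so the unconditional overdispersion is still there but small.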
