  • Creating a random sample which has a regression line through origin

    Dear all,

    I would like to generate a random sample (1 dependent variable, 1 independent variable, and 50 observations) whose regression line (fitted values) passes through the origin. (Of course, it should not look like the graph in the attached file; I want the values of Y to fluctuate somewhat.) However, I don't know how to do that.

    Is there anybody here who can help me?

    I would really appreciate all the help I can get.

    Best regards.

    [Attached image: Untitled1.png]
    --------------------
    (Stata 15.1 MP)

  • #2
    You could use -corr2data-. E.g.,

    Code:
    clear *
    matrix C = (1, .5 \ .5, 1) // replace .5 with desired correlation
    corr2data x y, n(50) means(0 0) sds(1 1) corr(C)
    regress y x
    twoway scatter y x || lfit y x, xlab(,grid) ylab(,grid)
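
A short check of why this works, offered as a sketch rather than anything definitive: -corr2data- builds data whose sample means, SDs, and correlation match the requested values (up to rounding in storage). With both means at 0, the least-squares intercept, ybar - b*xbar, is therefore essentially zero, and with both SDs at 1 the slope equals the requested correlation. The display formats below are just one choice.

```stata
clear *
matrix C = (1, .5 \ .5, 1)                      // correlation of .5, as in the code above
corr2data x y, n(50) means(0 0) sds(1 1) corr(C)

summarize x y                                   // means should be ~0 and SDs ~1
regress y x

* The intercept should be zero apart from rounding; the slope should be ~.5
display "intercept = " %12.10f _b[_cons]
display "slope     = " %12.10f _b[x]
```

Note that the x and y values produced this way are not confined to any particular interval.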

    HTH.
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)


    • #3
      Hi Bruce. Thanks for your quick response.

      If I want to limit the range of x to the interval (0,1) and the range of y to the interval (0,1), how can I do that?
      --------------------
      (Stata 15.1 MP)


      • #4
        Off the top of my head, I don't know how to impose those restrictions. Perhaps someone else will jump in.
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 18.5 (Windows)


        • #5
          I don't think there is a neat solution to this. Handling two of the three conditions (through the origin, range on y, range on x) is easy, but the third makes it hard. This is not elegant but gives close to what you want:

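          set seed 9999
          gen x=runiform()        // x is uniform on (0,1)
          g e=runiform()          // multiplicative noise, also uniform on (0,1)
          g y= (x * e)            // product of two (0,1) values, so y stays in (0,1)

          su
          reg y x
          predict yhat
          twoway scatter yhat x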
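
A minimal sketch of how the fitted line could be forced through the origin exactly, assuming the same multiplicative setup as above. Because E[y|x] = x*E[e] = x/2, an ordinary -regress- should give an intercept near zero but not exactly zero; the -noconstant- option of -regress- constrains it to zero. The name yhat0 is made up for illustration, and -set obs- is included because -generate- needs observations to exist.

```stata
clear
set obs 50                      // observations must exist before -generate-
set seed 9999
generate x = runiform()
generate e = runiform()
generate y = x*e                // both factors in (0,1), so y is in (0,1)

regress y x                     // intercept is estimated: near zero, not exactly zero
regress y x, noconstant         // intercept constrained to zero
predict yhat0                   // fitted values from the no-constant model

twoway scatter y x || line yhat0 x, sort xlab(0(.2)1, grid) ylab(0(.2)1, grid)
```

The scatter is heteroskedastic by construction: the spread of y grows with x, since y can be anywhere between 0 and x.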


          • #6
            Thanks Phil and Bruce!
            --------------------
            (Stata 15.1 MP)


            • #7
              Linh:
              remember to -set- the number of observations to make Phil's helpful code run.
              I would also advise you to consider the idiosyncratic errors (e) as additive to the predictor (x) (see https://www.stata.com/bookstore/heal...s-using-stata/, page 47):
              Code:
              . set seed 9999
              
              . set obs 50
              
              .  gen x=runiform()
              
              .  g e=runiform()
              
              .  g y=x + e
              
              .  reg y x
              
                    Source |       SS           df       MS      Number of obs   =        50
              -------------+----------------------------------   F(1, 48)        =     45.67
                     Model |  3.43090904         1  3.43090904   Prob > F        =    0.0000
                  Residual |   3.6057322        48  .075119421   R-squared       =    0.4876
              -------------+----------------------------------   Adj R-squared   =    0.4769
                     Total |  7.03664124        49  .143604923   Root MSE        =    .27408
              
              ------------------------------------------------------------------------------
                         y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                         x |   .9232128   .1366071     6.76   0.000     .6485458     1.19788
                     _cons |   .6183186   .0776361     7.96   0.000     .4622206    .7744165
              ------------------------------------------------------------------------------
              
              .  predict yhat
              (option xb assumed; fitted values)
              
              .  twoway scatter yhat x   // see resulting graph attached
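
One point that may be worth checking in the output above: the estimated constant is about 0.62, not 0. That is what one would expect, because runiform() has mean 0.5, so with y = x + e the conditional mean is E[y|x] = 0.5 + x. A hedged variation, which is an assumption on my part and not taken from the cited book, is to centre the error at zero so that E[y|x] = x:

```stata
clear
set obs 50
set seed 9999
generate x = runiform()
generate e = runiform() - 0.5   // shifted so the error has mean zero
generate y = x + e              // now E[y|x] = x, a line through the origin

regress y x                     // constant should now be near zero, though not exactly
summarize y                     // y can fall anywhere in (-0.5, 1.5)
```

This brings the population line through the origin, but y is no longer confined to (0,1), so the range requirement is still not met.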
              Last edited by Carlo Lazzaro; 26 Jun 2018, 03:41.
              Kind regards,
              Carlo
              (Stata 19.0)


              • #8
                I don't have access to the book Carlo pointed to in #7, but I suspect he had something like this in mind:

                Code:
                * Phil's solution in #5 with modifications
                
                local b  = 1  // intended slope for regression line--modify if you wish
                local d = 3.5 // divisor for estimating se below
                
                clear *
                set obs 50
                set seed 99999
                gen x=runiform()
                generate se = (0.5- abs(x-.5)) / `d'
                generate e = rnormal(0,se)
                generate y= `b'*x + e
                
                summarize x y // check that x and y are within desired range
                regress y x, noheader
                display _newline ///
                "Yhat when X=0:  "_b[_cons] + _b[x]*0 _newline ///
                "Yhat when X=1:   "_b[_cons] + _b[x]*1
                twoway scatter y x || lfit y x, xlab(0(.1)1,grid) ylab(0(.1)1,grid)

                Bear in mind that some trial & error was required to arrive at settings that yielded X and Y values within the desired 0-1 range while still meeting the other requirements reasonably closely. HTH.
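
A rough way to see why the settings above usually keep y inside (0,1), assuming the slope b is left at 1. Since se = min(x, 1-x)/d, the error SD shrinks to zero at both ends of the x range, and an observation can leave the interval only if its standard-normal draw is more than d SDs from zero. That gives an upper bound of 2*normal(-d) per observation, roughly 0.0005 for d = 3.5, and a bound of roughly 0.02 for at least one escape among 50 observations. The scalar names p_one and p_any are made up for illustration.

```stata
local d = 3.5                   // divisor used for se in the code above
local n = 50                    // number of observations

* Upper bound on P(y outside (0,1)) for a single observation when b = 1
scalar p_one = 2*normal(-`d')

* Corresponding bound on P(at least one of n observations falls outside)
scalar p_any = 1 - (1 - scalar(p_one))^`n'

display "per-observation bound: " %8.6f scalar(p_one)
display "bound for n = `n':      " %8.6f scalar(p_any)
```

A smaller d gives more scatter around the line but raises both bounds, which is the trade-off behind the trial and error; with a slope other than 1 the bound no longer applies as stated.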
                --
                Bruce Weaver
                Email: [email protected]
                Version: Stata/MP 18.5 (Windows)


                • #9
                  Bruce greatly improved my idea!
                  Kind regards,
                  Carlo
                  (Stata 19.0)
