  • Compute sample variance

    Hello everyone. I am new to Stata.

    I have a time series data set with 634 observations, and I would like to compute the sample variance separately for the first half and the second half of the data. I know the tabstat command, but I do not know how to calculate the variance in this specific way. My goal is to check whether the variance is constant over time. Thank you very much.

  • #2
    It depends on how your data is organized. Can you give us an extract of your data? See the Statalist FAQ (black bar near the top of this page) on how to do this.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Thank you very much. Here is an extract of my data.

      varves time
      26.28 1
      27.42 2
      42.28 3
      58.28 4
      20.57 5
      28.57 6
      49.14 7
      24 8
      16 9
      42.28 10
      30.85 11
      36.57 12
      20.57 13
      24 14
      32 15
      18.28 16
      21.71 17
      11.42 18
      11.42 19
      12.57 20
      57.14 21
      60.57 22
      21.71 23
      29.71 24
      43.42 25
      40 26
      18.28 27
      21.71 28
      25.14 29
      20.57 30
      22.85 31
      28.57 32
      30.85 33
      16.66 34
      15.2 35
      15.58 36
      31.17 37
      48.97 38
      9 39
      Last edited by Rodrigo Soares; 03 Feb 2022, 10:12.


      • #4
        The following example (a) shows how to present example data as requested in post #2 and (b) demonstrates code that works on this example data. In particular, it relies on the fact that your "time" variable starts at 1 and increments by 1, and assumes that that is how you wanted to define what "half the data" means.
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float(varves time)
        26.28  1
        27.42  2
        42.28  3
        58.28  4
        20.57  5
        28.57  6
        49.14  7
           24  8
           16  9
        42.28 10
        30.85 11
        36.57 12
        20.57 13
           24 14
           32 15
        18.28 16
        21.71 17
        11.42 18
        11.42 19
        12.57 20
        end
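        * Assign every observation to the first half by default
        generate half = 1
        * time runs 1, 2, ..., _N, so time[_N]/2 marks the midpoint
        replace half = 2 if time > time[_N]/2
        * Report the count, mean, and sample variance within each half
        tabstat varves, by(half) statistics(n mean variance)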
        Code:
        . tabstat varves, by(half) statistics(n mean variance)
        
        Summary for variables: varves
        Group variable: half 
        
            half |         N      Mean  Variance
        ---------+------------------------------
               1 |        10    33.482  187.7684
               2 |        10    21.939  80.37059
        ---------+------------------------------
           Total |        20   27.7105  162.0766
        ----------------------------------------
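
        If the time variable did not start at 1 or had gaps, one possible variation (a sketch, not part of the tested example above) would be to split on the observation number after sorting. The variable name half2 is hypothetical, chosen only to avoid overwriting half. With an odd number of observations, the second half receives the extra observation.
        Code:
        * Sketch: define the halves by sort order, not by the values of time
        sort time
        * _n is the current observation number, _N the total number of observations
        generate half2 = cond(_n <= _N/2, 1, 2)
        tabstat varves, by(half2) statistics(n mean variance)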


        • #5
          Thank you very much. So in this case, do I have the variance for both the first half and the second half of my data?


          • #6
            You have the variance for the first half (half==1) and for the second half (half==2) of the 20 observations in the example data. The N column confirms that each half has 10 observations. If you run the code on your entire dataset, you will have the variances for the two halves of your data.
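
            Since the stated goal in the original question is to judge whether the variance is constant, a formal comparison of the two halves may be a useful next step. The following is only a sketch: it assumes the variable half has been created as in post #4, and both tests treat the observations as independent, which may not hold for a time series.
            Code:
            * Classical F test of equal variances across the two halves (assumes normality)
            sdtest varves, by(half)
            * Levene's test and its robust variants, less sensitive to non-normality
            robvar varves, by(half)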


            • #7
              Many thanks.
