  • Sum of squared deviation

    I have a dataset as follows:
    time   return of stock
           a         b         c        d        e      f      g       h
    1      0.433     0.4343    -0.661   0.8766   0.48   0.71   0.72    0.33
    n      -0.5353   -0.424    0.146    0.11     0.97   0.23   -0.52   0.99
    I've entered random data but that's not the main topic.

    I'd like to create an extra variable, in a column next to return of stock h, that holds the sum of squared differences of stocks b-h from stock a, computed separately for each time 1 to n.

    So for time = 1 specifically, I would like (return of stock b - return of stock a)^2 + (return of stock c - return of stock a)^2 + (return of stock d - return of stock a)^2, etc., through stock h.

    I'd like to repeat this calculation for every row, up to time n.

    How would I go about doing this?

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte time float(ret_stock_a ret_stock_b ret_stock_c ret_stock_d ret_stock_e ret_stock_f ret_stock_g ret_stock_h)
    1   .433 .4343 -.661 .8766 .48 .71  .72 .33
    2 -.5353 -.424  .146   .11 .97 .23 -.52 .99
    end
    
    rename ret_stock_a base
    reshape long ret_stock_, i(time) j(stock_name) string
    by time, sort: egen wanted = total((ret_stock_ - base)^2)
    Now, if you have some good reason to return to the wide layout of your original data you can do that by running -reshape wide-. But bear in mind that almost all data management and analysis in Stata work best with data in long layout. In fact, many things in Stata are impossible in wide layout. So don't go back to wide unless you are sure that your next steps involve those rare things that require wide data.
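
    If the wide layout really is needed again, the return trip could look roughly like the sketch below. It assumes the variable names from the example above (base, stock_name, ret_stock_, wanted) and spells out the -reshape wide- arguments explicitly rather than relying on what Stata remembers from the earlier -reshape long-.

    ```stata
    * Go back to one row per time; base and wanted are constant within time,
    * so they are carried along as ordinary variables
    reshape wide ret_stock_, i(time) j(stock_name) string

    * Restore the original name of the reference stock
    rename base ret_stock_a
    ```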

    In the future, when showing data examples, please use the -dataex- command to do so, as I have in this response. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
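
    As a rough illustration, a -dataex- call for data like this might look as follows. The variable names are the ones used in the example above, and the sketch assumes the dataset has at least five observations and that the return variables are stored next to each other.

    ```stata
    * Produce a copy-and-paste example of the first 5 observations
    dataex time ret_stock_a-ret_stock_h in 1/5
    ```

    The output can then be pasted directly into a post.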


    • #3
      Hi Clyde,

      Thanks so much for your response. This was a real help.

      However, my dataset is n=1515 long so I don't think it can be manually inputted using the input function.
      How would I go about doing this, taking it directly from the datalist itself?


      • #4
        Why do you think your dataset needs to be manually input? Clyde needed example data to test and demonstrate his code on. As his final paragraph points out, you neglected to provide usable example data, so he took what you showed and used it to build his code.

        Have you tried applying Clyde's code to your dataset? You will of course have to change the variable names to match whatever the variable names are in your dataset. There may be other changes you need to make.

        Clyde has demonstrated how to accomplish what you want to do. Your job now is to read his example, understand how it works, and adapt it to your actual dataset.
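
        A minimal sketch of what that adaptation might look like, assuming the data are already saved as a Stata dataset. The file name my_returns.dta is hypothetical, and the variable names are the ones from Clyde's example, so both will need to be changed to match the real data.

        ```stata
        * Hypothetical file name: replace with the path to your own dataset
        use "my_returns.dta", clear

        * Same three steps as in #2, with no -input- block needed
        rename ret_stock_a base
        reshape long ret_stock_, i(time) j(stock_name) string
        by time, sort: egen wanted = total((ret_stock_ - base)^2)
        ```

        The -clear- ... -input- ... -end- lines in #2 only build the two-row example; they are not part of the calculation itself.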


        • #5
          And the way to do this in wide format is:

          Code:
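          * Squared difference of each stock from stock a, held in temporary variables
          foreach var of varlist ret_stock_b-ret_stock_h {
              gen `var'temp = (`var' - ret_stock_a)^2
          }

          * Row-wise sum of the squared differences
          egen sqdif = rowtotal(ret_stock_btemp-ret_stock_htemp)

          * Remove the temporary variables
          drop *temp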
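
          A variant that avoids the temporary variables is to accumulate the sum inside the loop. This is only a sketch: the variable name sqdif_alt is made up for illustration, and it assumes ret_stock_b through ret_stock_h sit next to each other in the dataset.

          ```stata
          * Start the running total at zero, stored in double precision
          gen double sqdif_alt = 0

          * Add each stock's squared difference from stock a to the total
          foreach var of varlist ret_stock_b-ret_stock_h {
              replace sqdif_alt = sqdif_alt + (`var' - ret_stock_a)^2
          }
          ```

          One difference to keep in mind: rowtotal() treats missing values as zero, whereas this running sum becomes missing for any row in which one of the returns is missing.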
