Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Create var with only first observation

    Hello,
    Can someone help me with the code to create a variable "initial leverage" with using only the first observation of bklevg for each firm? I managed to do it for one year only but with that proceed there are some values missing because some firms have their first observation later. Thank you

  • #2
    My wild guess, given that there's not dataset for me to work with here, is
    Code:
    qbys group: g first = _n if !mi(var)
    Haven't tested this, but it counts from 1...total observations in a group if the value isn't missing. Note that this presumes all values prior to the first observation are set to missing.

    Comment


    • #3
      thank you for your answer, i have the book leverage for several years and firms, now I want to create the variable initial leverage and use only the first observation for each firm. I use the compustat dataset.

      Comment


      • #4
        Did you try my code?

        And no, when I say dataset, I really mean the dataset you'd get by using the dataex command, so I can literally see what your real data look like.

        Comment


        • #5
          You don't need to email me directly asking for assistance 🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣.

          I promise you, if you simply use the dataex command I mentioned above (as the FAQ asks of you), and show me what your data look like, I can likely solve this in two seconds and this'll go much smoother.

          Comment


          • #6
            Thank you for your help, I appreciate it. I'm really a beginner in Stata. I tried your code but I think it's not exactly what I was looking for. Some firms got their first observations later, for example in 1975 instead of 1965, like most others. With your code these first observations don't get the value 1. Please correct me if I'm wrong.
            My data looks like this,
            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str6 gvkey double fyear float bklevg
            "001000" 1965   .6294372
            "001000" 1966   .5074074
            "001000" 1967  .56229645
            "001000" 1968  .25954068
            "001000" 1969   .3045417
            "001000" 1970   .3974589
            "001000" 1971   .4491988
            "001000" 1972   .3516351
            "001000" 1973   .3215286
            "001000" 1974  .29253453
            "001000" 1975  .24442585
            "001000" 1976   .3610895
            "001000" 1977   .4236002
            "001001" 1983   .3454545
            "001001" 1984   .2937235
            "001001" 1985   .5125712
            "001002" 1968   .4297259
            "001002" 1969   .4234216
            "001002" 1970   .3232765
            "001002" 1971  .13577497
            "001002" 1972  .22008166
            "001003" 1982   .2224787
            "001003" 1983  .14069645
            "001003" 1984  .11527727
            "001003" 1985  .33595425
            "001003" 1986   .2579871
            "001003" 1987   .3433487
            "001003" 1988   .4748157
            "001003" 1989   .4476209
            "001004" 1965  .19849147
            "001004" 1966   .1651492
            "001004" 1967  .22930327
            "001004" 1968  .34810615
            "001004" 1969    .335549
            "001004" 1970   .3425156
            "001004" 1971  .31864735
            "001004" 1972  .28373775
            "001004" 1973   .3468845
            "001004" 1974   .5356954
            "001004" 1975   .4737132
            "001004" 1976   .4840834
            "001004" 1977    .474585
            "001004" 1978   .4885971
            "001004" 1979  .46216005
            "001004" 1980    .467108
            "001004" 1981   .3832191
            "001004" 1982   .3120642
            "001004" 1983  .11526802
            "001004" 1984  .18887423
            "001004" 1985   .2869578
            "001004" 1986   .2054056
            "001004" 1987   .2549553
            "001004" 1988  .26930532
            "001004" 1989  .27321562
            "001004" 1990   .2249012
            "001004" 1991  .23353426
            "001004" 1992  .25009653
            "001004" 1993  .27847165
            "001004" 1994  .28509632
            "001004" 1995  .27353454
            "001004" 1996   .2233678
            "001004" 1997   .2650714
            "001004" 1998   .2495892
            "001004" 1999  .27903044
            "001004" 2000  .27589643
            "001004" 2001     .36641
            "001004" 2002   .3741715
            "001004" 2003   .3553656
            "001005" 1974   .2303457
            "001005" 1975   .2177684
            "001005" 1976  .16471837
            "001005" 1977  .18459633
            "001005" 1978  .30025125
            "001005" 1979   .3223435
            "001005" 1980  .14247936
            "001005" 1981  .29187092
            "001006" 1975    .083391
            "001006" 1976  .04948522
            "001006" 1977 .015176152
            "001006" 1978 .013190437
            "001006" 1979 .008068583
            "001006" 1980 .036235955
            "001006" 1981 .005572441
            "001006" 1982  .08350141
            "001007" 1976 .006148089
            "001007" 1977  .08268563
            "001008" 1983  .05714286
            "001008" 1984   .1839671
            "001008" 1985  .03626943
            "001009" 1981   .5390736
            "001009" 1982   .6598155
            "001009" 1983   .8872525
            "001009" 1984   .7358214
            "001009" 1985   .5620438
            "001009" 1986   .6327684
            "001009" 1987   .6534248
            "001009" 1988   .4867115
            "001009" 1989    .633677
            "001009" 1990    .648647
            "001009" 1991   .5955454
            end

            Comment


            • #7
              I'm confident a Stata Wizard exists who could tell me why my solution is inefficient, but is this about it?
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str6 gvkey double fyear float bklevg
              "001000" 1965   .6294372
              "001000" 1966   .5074074
              "001000" 1967  .56229645
              "001000" 1968  .25954068
              "001000" 1969   .3045417
              "001000" 1970   .3974589
              "001000" 1971   .4491988
              "001000" 1972   .3516351
              "001000" 1973   .3215286
              "001000" 1974  .29253453
              "001000" 1975  .24442585
              "001000" 1976   .3610895
              "001000" 1977   .4236002
              "001001" 1983   .3454545
              "001001" 1984   .2937235
              "001001" 1985   .5125712
              "001002" 1968   .4297259
              "001002" 1969   .4234216
              "001002" 1970   .3232765
              "001002" 1971  .13577497
              "001002" 1972  .22008166
              "001003" 1982   .2224787
              "001003" 1983  .14069645
              "001003" 1984  .11527727
              "001003" 1985  .33595425
              "001003" 1986   .2579871
              "001003" 1987   .3433487
              "001003" 1988   .4748157
              "001003" 1989   .4476209
              "001004" 1965  .19849147
              "001004" 1966   .1651492
              "001004" 1967  .22930327
              "001004" 1968  .34810615
              "001004" 1969    .335549
              "001004" 1970   .3425156
              "001004" 1971  .31864735
              "001004" 1972  .28373775
              "001004" 1973   .3468845
              "001004" 1974   .5356954
              "001004" 1975   .4737132
              "001004" 1976   .4840834
              "001004" 1977    .474585
              "001004" 1978   .4885971
              "001004" 1979  .46216005
              "001004" 1980    .467108
              "001004" 1981   .3832191
              "001004" 1982   .3120642
              "001004" 1983  .11526802
              "001004" 1984  .18887423
              "001004" 1985   .2869578
              "001004" 1986   .2054056
              "001004" 1987   .2549553
              "001004" 1988  .26930532
              "001004" 1989  .27321562
              "001004" 1990   .2249012
              "001004" 1991  .23353426
              "001004" 1992  .25009653
              "001004" 1993  .27847165
              "001004" 1994  .28509632
              "001004" 1995  .27353454
              "001004" 1996   .2233678
              "001004" 1997   .2650714
              "001004" 1998   .2495892
              "001004" 1999  .27903044
              "001004" 2000  .27589643
              "001004" 2001     .36641
              "001004" 2002   .3741715
              "001004" 2003   .3553656
              "001005" 1974   .2303457
              "001005" 1975   .2177684
              "001005" 1976  .16471837
              "001005" 1977  .18459633
              "001005" 1978  .30025125
              "001005" 1979   .3223435
              "001005" 1980  .14247936
              "001005" 1981  .29187092
              "001006" 1975    .083391
              "001006" 1976  .04948522
              "001006" 1977 .015176152
              "001006" 1978 .013190437
              "001006" 1979 .008068583
              "001006" 1980 .036235955
              "001006" 1981 .005572441
              "001006" 1982  .08350141
              "001007" 1976 .006148089
              "001007" 1977  .08268563
              "001008" 1983  .05714286
              "001008" 1984   .1839671
              "001008" 1985  .03626943
              "001009" 1981   .5390736
              "001009" 1982   .6598155
              "001009" 1983   .8872525
              "001009" 1984   .7358214
              "001009" 1985   .5620438
              "001009" 1986   .6327684
              "001009" 1987   .6534248
              "001009" 1988   .4867115
              "001009" 1989    .633677
              "001009" 1990    .648647
              "001009" 1991   .5955454
              end
              
              
              destring gvkey, replace
              
              replace bklevg = . if gvkey == 1009 & fyear ==1981 
              
              xtset gvkey
              
              qbys gvkey: g first = _n if !mi(bklevg)
              
              tempvar firstmiss
              
              g `firstmiss' = 1 if first ==.
              
              levelsof gvkey if `firstmiss' ==1, l(ID)
              
              foreach x of loc ID {
                  
                  replace first = first-1 if gvkey ==`x'
              }
              
              br

              Comment


              • #8
                Three problems need to be kept apart,

                1 when the first non-missing value occurs
                2 the year being that with the first non-missing value (a true or false question)
                3 the first non-missing value.

                There aren't any missing values in the data example, so the first line of attack below works with that.

                When there are missing values, you need something a little more subtle. The second line of attack below is one possibility.

                This is also an FAQ. https://www.stata.com/support/faqs/d...t-occurrences/

                I can't tell how many times people find a solution in the FAQs and thus never need to post, but the FAQs are evidently an under-used resource.

                See also Section 9 of https://www.stata-journal.com/articl...article=dm0055

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input str6 gvkey double fyear float bklevg
                "001000" 1965   .6294372
                "001000" 1966   .5074074
                "001000" 1967  .56229645
                "001000" 1968  .25954068
                "001000" 1969   .3045417
                "001000" 1970   .3974589
                "001000" 1971   .4491988
                "001000" 1972   .3516351
                "001000" 1973   .3215286
                "001000" 1974  .29253453
                "001000" 1975  .24442585
                "001000" 1976   .3610895
                "001000" 1977   .4236002
                "001001" 1983   .3454545
                "001001" 1984   .2937235
                "001001" 1985   .5125712
                "001002" 1968   .4297259
                "001002" 1969   .4234216
                "001002" 1970   .3232765
                "001002" 1971  .13577497
                "001002" 1972  .22008166
                "001003" 1982   .2224787
                "001003" 1983  .14069645
                "001003" 1984  .11527727
                "001003" 1985  .33595425
                "001003" 1986   .2579871
                "001003" 1987   .3433487
                "001003" 1988   .4748157
                "001003" 1989   .4476209
                "001004" 1965  .19849147
                "001004" 1966   .1651492
                "001004" 1967  .22930327
                "001004" 1968  .34810615
                "001004" 1969    .335549
                "001004" 1970   .3425156
                "001004" 1971  .31864735
                "001004" 1972  .28373775
                "001004" 1973   .3468845
                "001004" 1974   .5356954
                "001004" 1975   .4737132
                "001004" 1976   .4840834
                "001004" 1977    .474585
                "001004" 1978   .4885971
                "001004" 1979  .46216005
                "001004" 1980    .467108
                "001004" 1981   .3832191
                "001004" 1982   .3120642
                "001004" 1983  .11526802
                "001004" 1984  .18887423
                "001004" 1985   .2869578
                "001004" 1986   .2054056
                "001004" 1987   .2549553
                "001004" 1988  .26930532
                "001004" 1989  .27321562
                "001004" 1990   .2249012
                "001004" 1991  .23353426
                "001004" 1992  .25009653
                "001004" 1993  .27847165
                "001004" 1994  .28509632
                "001004" 1995  .27353454
                "001004" 1996   .2233678
                "001004" 1997   .2650714
                "001004" 1998   .2495892
                "001004" 1999  .27903044
                "001004" 2000  .27589643
                "001004" 2001     .36641
                "001004" 2002   .3741715
                "001004" 2003   .3553656
                "001005" 1974   .2303457
                "001005" 1975   .2177684
                "001005" 1976  .16471837
                "001005" 1977  .18459633
                "001005" 1978  .30025125
                "001005" 1979   .3223435
                "001005" 1980  .14247936
                "001005" 1981  .29187092
                "001006" 1975    .083391
                "001006" 1976  .04948522
                "001006" 1977 .015176152
                "001006" 1978 .013190437
                "001006" 1979 .008068583
                "001006" 1980 .036235955
                "001006" 1981 .005572441
                "001006" 1982  .08350141
                "001007" 1976 .006148089
                "001007" 1977  .08268563
                "001008" 1983  .05714286
                "001008" 1984   .1839671
                "001008" 1985  .03626943
                "001009" 1981   .5390736
                "001009" 1982   .6598155
                "001009" 1983   .8872525
                "001009" 1984   .7358214
                "001009" 1985   .5620438
                "001009" 1986   .6327684
                "001009" 1987   .6534248
                "001009" 1988   .4867115
                "001009" 1989    .633677
                "001009" 1990    .648647
                "001009" 1991   .5955454
                end
                
                bysort gvkey (fyear) : gen first1 = bklevg[1]
                
                by gvkey : gen whenfirst1 = fyear[1]
                
                by gvkey : egen whenfirst2 = min(cond(!missing(bklevg), fyear, .)) 
                
                by gvkey : egen first2 = mean(cond(fyear == whenfirst2, bklevg, .)) 
                
                tabdisp gvkey, c(whenfirst1 first1  whenfirst2 first2) 
                
                ----------------------------------------------------------
                    gvkey | whenfirst1      first1  whenfirst2      first2
                ----------+-----------------------------------------------
                   001000 |       1965    .6294372        1965    .6294372
                   001001 |       1983    .3454545        1983    .3454545
                   001002 |       1968    .4297259        1968    .4297259
                   001003 |       1982    .2224787        1982    .2224787
                   001004 |       1965    .1984915        1965    .1984915
                   001005 |       1974    .2303457        1974    .2303457
                   001006 |       1975     .083391        1975     .083391
                   001007 |       1976    .0061481        1976    .0061481
                   001008 |       1983    .0571429        1983    .0571429
                   001009 |       1981    .5390736        1981    .5390736
                ----------------------------------------------------------

                Comment


                • #9
                  Nick Cox for what it's worth, the FAQ as luck would have it is RIDICULOUSLY helpful on my end.

                  Whenever I need to remind myself of the reshape syntax or use a bit of code that I've used before but don't quite have on hand, FAQ usually helps me a lot.

                  Comment

                  Working...
                  X