Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating a variable for panel data

    Hello Statalist members,

    my research is about venture capital (VC) firms and their investments in entrepreneurial or start-up companies. my data is panel data.

    I need to create a variable to measure the size of a VC firm in a certain year (variable: VC size in year x) using values from the previous funds they raise over the years (variable: fund size, variable fund year).

    I am trying to create this variable "VC size in year x = sum of all fund sizes in previous years up to year x."

    However, the data is unbalanced and not consistent. For example, some VC firms raised only 1 fund so they have only a 1 year observation for VC size in year x. Moreover, some VC firms have multiple funds in the same year.

    Is there a code that helps me to create this variable?

    This is how my data looks now
    Firm Name Fund
    Year
    Fund Size
    (Mil)
    VC A 1995 60.0
    VC A 1999 150.0
    VC A 2013 30.0
    VC A 2016 49.0
    VC B 2021 60.0
    VC C 2019 17.6
    VC D 2000 15.3
    VC D 2011 15.0
    VC D 2014 22.2
    VC D 2014 85.0
    VC D 2015 33.2

    This is what I think the data should look like before the analysis:
    Firm Name Fund
    Year
    Fund Size
    (Mil)
    year x VC size in a year x
    VC A 1995 60.0 1995 60.00
    VC A 1999 150.0 1999 210.00
    VC A 2013 30.0 2013 240.00
    VC A 2016 49.0 2016 289.00
    VC B 2021 60.0 2021 60.00
    VC C 2019 17.6 2019 17.60
    VC D 2000 15.3 2000 15.30
    VC D 2011 15.0 2011 30.30
    VC D 2014 22.2 2014 137.50
    VC D 2014 85.0 2014 137.50
    VC D 2015 33.2 2015 155.40
    Can I create this variable using state codes or commands?

    I really appreciate your guidance.


  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str5 firm_name int year float fund_size
    "VC A " 1995   60
    "VC A " 1999  150
    "VC A " 2013   30
    "VC A " 2016   49
    "VC B " 2021   60
    "VC C " 2019 17.6
    "VC D " 2000 15.3
    "VC D " 2011   15
    "VC D " 2014 22.2
    "VC D " 2014   85
    "VC D " 2015 33.2
    end
    
    by firm_name year, sort: egen funds_raised = total(fund_size)
    by firm_name year: replace funds_raised = . if _n > 1
    by firm_name (year): gen wanted = sum(funds_raised)
    I believe there is an error in your example for VC D in year 2015. In 2014 it had a total of 137.50, and then in 2015 it adds a additional 33.2. So the total cannot be 155.40.

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Dear Clyde,
      Thank you very much for your help. I am still a beginner with Stata, and I am new to this Stata forum.

      I will do your suggestions now and I will use -dataex- command in anty future posts.

      Comment


      • #4
        Welcome to Statalist!

        Code:
        clear
        input str5 firm year fund
        "VC A"     1995     60.0
        "VC A"     1999     150.0
        "VC A"     2013     30.0
        "VC A"     2016     49.0
        "VC B"     2021     60.0
        "VC C"     2019     17.6
        "VC D"     2000     15.3
        "VC D"     2011     15.0
        "VC D"     2014     22.2
        "VC D"     2015     33.2
        "VC D"     2014     85.0
        end
        
        bysort firm (year): gen temp01 = fund if _n == 1
        bysort firm (year): replace temp01 = fund + temp01[_n - 1] if temp01 == .
        bysort firm year: egen wanted = max(temp01)
        The 155.40 for VC D in 2015 should be 170.7, correct?

        In future, please refer to the FAQ (http://www.statalist.org/forums/help) and use dataex to present your data example in code form rather than table form. That'd save us some time to recreate the data. For instance, if this has been done, then the first 14 lines in my code could have been saved.

        Comment


        • #5
          Dear Ken,
          Thank you for your help. Your codes worked and created the variable.

          How I can drop observations with multiple years now so I can have only 1 yearly observation for the variable VC_size (you called it wanted)
          I mean I want the data with 1 observation for each year.

          ----------------------- copy starting from the next line -----------------------
          [CODE]
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str59 FirmName int FundYear double fund_size float(temp01 VC_size)
          "10 Point Capital LLC" 2021 264.7 264.7 264.7
          "100 X Better Inc" 2021 124 124 124
          "10T Holdings LLC" 2020 20 20 20
          "10X Capital Management LLC" 2020 5.7 5.7 5.7
          "10X Venture Partners LLC" 2020 38 38 38
          "12/12 Ventures LLC" 2018 300 300 300
          "122 West Ventures LP" 2015 200 200 200
          "1315 Capital LLC" 2014 138 138 138
          "1315 Capital LLC" 2018 250 388 388
          "137 Ventures LP" 2011 50 50 50
          "137 Ventures LP" 2015 6.4 56.4 56.4
          "137 Ventures LP" 2021 2 58.4 58.4
          "1414 Ventures Fund LLC" 2019 19 19 19
          "1517 Fund I LP" 2015 12.5 12.5 12.5
          "1776 Seed Investors LP" 2004 3 3 3
          "18.ventures" 1993 475 475 475
          "180 Capital Fund LLC" 1998 530 530 530
          "180 Degree Capital Corp" 1989 325 325 325
          "1843 Capital LLC" 1997 250 250 250
          "1984 Ventures LP" 2016 200 200 200
          "1ST Interstate Capital Corp" 2019 45 45 45
          "1confirmation Global LLC" 2017 42.3 42.3 42.3
          "1st Course Capital Management LLC" 1984 .2 .2 .2
          "1st Source Capital Corp" 2017 .4 .4 .4
          "1st Source Capital Corp" 2019 3.6 4 4
          "2020 Ventures LLC" 2019 27 27 27
          "2048 Ventures LLC" 2022 67 67 67
          "21 Ventures LLC" 1999 99.1 99.1 99.1
          "212 Ventures" 1995 243.5 243.5 243.5
          "25Madison LLC" 1996 55.8 55.8 55.8
          "2M Technology Group, L.C." 2001 60 60 60
          "2PM Inc" 1999 50 50 50
          "3 Rivers Capital" 2007 1.3 1.3 1.3
          "3COM VENTURES(3COM CORP)" 1996 100 100 100
          "3L Capital Management LLC" 2018 211.6 211.6 211.6
          "3Lines LLC" 2018 10 10 11.2
          "3Lines LLC" 2018 1.2 11.2 11.2
          "3M Corporation" 2019 5 5 5
          "3dfx Interactive Inc" 2000 250 250 250
          "3se Holdings LP" 2021 .5 .5 .5
          "4 Good Ventures" 2015 55.9 55.9 55.9
          "4-D Investments LLC" 2021 60 60 60
          "4.0 Partners" 2011 69.5 69.5 69.5
          "412 Venture Fund LP" 2021 11.7 11.7 11.7
          "4C Ventures" 1995 60 60 60
          "4C Ventures" 1999 150 210 210
          "4C Ventures" 2013 30 240 240
          "4C Ventures" 2016 49 289 289
          "4DX Ventures LP" 2019 17.6 17.6 17.6
          "500 Startups Management Company LLC" 2000 15.3 15.3 15.3
          "500 Startups Management Company LLC" 2011 15 30.3 30.3
          "500 Startups Management Company LLC" 2014 22.2 52.5 137.5
          "500 Startups Management Company LLC" 2014 85 137.5 137.5
          "500 Startups Management Company LLC" 2015 6.9 144.4 193
          "500 Startups Management Company LLC" 2015 33.2 177.6 193
          "500 Startups Management Company LLC" 2015 15.4 193 193
          "500 Startups Management Company LLC" 2016 9.5 202.5 257.4
          "500 Startups Management Company LLC" 2016 14.1 216.6 257.4
          "500 Startups Management Company LLC" 2016 26.8 243.4 257.4
          "500 Startups Management Company LLC" 2016 14 257.4 257.4
          "500 Startups Management Company LLC" 2018 12.1 269.5 278.6
          "500 Startups Management Company LLC" 2018 9.1 278.6 278.6
          "500 Startups Management Company LLC" 2019 137.5 416.1 416.1
          "500 Startups Management Company LLC" 2020 17.9 434 434
          "500 Startups Management Company LLC" 2021 140 574 574
          "5280 Partners" 1983 .5 .5 .5
          "5280 Partners" 2012 197.8 198.3 198.3


          Comment


          • #6
            To reduce the data to one observation per firm per year, after running the code in #2 or #4 (they both work the same way--they are notational variants) just run
            [code]
            by firm year: keep if _n == 1

            Comment


            • #7
              I just figured out how to do it
              Many thanks

              Comment

              Working...
              X