  • Survival Analysis (Market Shares Computation)

    Hello everyone,

    I am in desperate need of help because I am stuck.

    I have data on individual airplanes: the manufacturer, the build year, the submarket (small, medium, or large airplanes), and the survival status (1 if the airplane is in use, 0 if it is retired).
    I want to calculate each manufacturer's market share on an annual basis, defined as: (Number of Airplanes of Manufacturer i in submarket j)/(Total Number of Airplanes in submarket j).


    clear
    input str32 Type str20 SerialNumber str42 Manufacturer int BuildYear str6 Submarket byte Survival
    "A320" "20230" "Airbus" 1969 "Heavy" 1
    "707" "19000" "Hawker Beechcraft Corp" 1965 "Heavy" 0
    "707" "20518" "Boeing" 1972 "Large" 1
    "707" "20519" "Boeing" 1972 "Heavy" 1
    "E175" "21250" "Bombardier (Canadair)" 1977 "Heavy" 1
    "707" "21207" "Boeing" 1976 "Small" 1
    "707" "21208" "Boeing" 1984 "Heavy" 0
    "Il-96" "21209" "Fokker" 1976 "Heavy" 1
    "707" "21434" "Boeing" 1977 "Medium" 1
    "A320" "21047" "Airbus" 1975 "Heavy" 0
    "707" "21435" "Boeing" 1977 "Large" 1
    "707" "21436" "Boeing" 1984 "Heavy" 1
    end



    I just cannot work out how to sort my data.

    Any help is welcome!!

  • #2
    The reason you cannot figure out how to sort your data is that what you want cannot be done with this data. For planes that are no longer in use (Survival = 0), there is no information about the year in which they went out of use. So all we can say about, for example, Airbus A320 #21047 is that it was built in 1975 and is no longer in use in 2021. But was it around in 2020? In 2010? In 1982? None of those questions can be answered, so you cannot get annualized information from this data.



    • #3
      Is there a retirement year for each retired airplane? Constructing annual market shares requires this information.

      Crossed with #2.



      • #4
        Thank you very much. Yes, I do have this information. The variable Age refers to airplanes that are currently in use, and the variable AgeatRetirement to those that are retired.


        clear
        input str32 Type str20 SerialNumber str42 Manufacturer double(Age AgeatRetirement) int(StatusChangeDate BuildYear) str6 Submarket byte Survival
        "MD-11" "48798" "Boeing (McDonnell-Douglas)" 18.1 . 14483 1999 "Heavy" 1
        "DC-8" "45264" "Boeing (McDonnell-Douglas)" . 24.4 9131 1960 "Small" 0
        "L-1011 TriStar" "1046" "Lockheed Aircraft Corp" . 22.7 13300 1973 "Heavy" 0
        "A330" "QTR-A330-78961" "Airbus" . . 19679 2019 "Medium" 1
        "787" "BER38756" "Boeing" . . 17335 2018 "Heavy" 1
        "C-135" "17811" "Boeing" 58.2 . -198 1959 "Heavy" 1
        "767" "24086" "Boeing" 28.8 . 21021 1988 "Small" 1
        "747" "23815" "Boeing" . 25.4 19984 1989 "Heavy" 0
        "C-17" "P-24" "Boeing (McDonnell-Douglas)" 21.6 . 13174 1996 "Heavy" 1
        "777" "61736" "Boeing" .9 . 20768 2016 "Heavy" 1
        "747" "26559" "Boeing" 16 . 20552 2001 "Heavy" 1
        "C-141" "6230" "Lockheed Aircraft Corp" . 38 16617 1967 "Large" 0
        "777" "27038" "Boeing" 12.5 . 16551 2005 "Large" 1
        "A330" "1693" "Airbus" 1.8 . 20437 2015 "Heavy" 1
        "777" "QTR-B777-78896" "Boeing" . . 19920 2026 "Large" 1
        "A350" "UAL-A350-109313" "Airbus" . . 21068 2027 "Heavy" 1
        "A380" "BAW38018" "Airbus" . . 17524 2020 "Heavy" 1
        "A330" "113" "Airbus" 22 . 13673 1995 "Heavy" 1
        "747" "19660" "Boeing" . 31.6 15350 1970 "Heavy" 0
        "A330" "889" "Airbus" 9.8 . 17519 2007 "Heavy" 1
        "A330" "328" "Airbus" 17.5 . 20850 2000 "Heavy" 1
        "767" "33048" "Boeing" 14.9 . 15690 2002 "Heavy" 1
        "747" "21831" "Boeing" . 25.2 16608 1980 "Medium" 0
        "A380" "130" "Airbus" 3.6 . 19901 2014 "Heavy" 1
        "747" "20332" "Boeing" . 38.7 18423 1971 "Heavy" 0
        "Il-76" "0073476281" "United Aircraft Corporation (Ilyushin)" . 20.7 17546 1987 "Heavy" 0
        "777" "29324" "Boeing" 18.9 . 20788 1998 "Heavy" 1
        "A300" "305" "Airbus" . 20.2 16512 1984 "Small" 0
        "A300" "794" "Airbus" 18.9 . 14200 1998 "Heavy" 1
        "C-135" "18149" "Boeing" . 39.1 14917 1961 "Heavy" 0
        end
        format %td StatusChangeDate









        Last edited by Bill Vasileiou; 12 Dec 2021, 04:08.



        • #5
          I made a few assumptions about your data. First, SerialNumber uniquely identifies airplanes. Second, the data were collected in 2017, as the ages of all working airplanes appear to be their ages in 2017. Third, annual market shares are constructed for the years from the earliest build year to 2017. A couple of planes appear to have been built after 2017; they are not counted, because we do not know when working planes will retire after 2017. Fourth, a plane is counted in both its build year and its retirement year (even if it was built near the end of the year or retired near the beginning of the year). Fifth, StatusChangeDate appears to be the retirement date of retired planes. The code would be

          Code:
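          * One row per plane and year, from the earliest build year to 2017.
          * r(min) comes from -sum-, so do not run another r-class command in between.
          sum BuildYear
          expand 2017-r(min)+1
          bys SerialNumber: gen year = r(min)-1+_n

          * A plane is working from its build year through its retirement year (inclusive)
          gen working = cond(Survival, year >= BuildYear, year >= BuildYear & year <= year(StatusChangeDate))

          * Planes per year and submarket, planes per manufacturer within those, then the share
          bys year Submarket: egen total = total(working)
          bys year Submarket Manufacturer: egen total_m = total(working)
          gen share = total_m / total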
          I show below the market shares in 2010. Three manufacturers shared the Heavy submarket, and Boeing monopolized the Large and Small submarkets. No manufacturer appeared in the Medium submarket that year.

          Code:
          . collapse share, by(year Submarket Manufacturer)
          . drop if share == 0 | share == .
          (403 observations deleted)
          . list if year == 2010
          
               +---------------------------------------------------------+
               |               Manufacturer   Submar~t   year      share |
               |---------------------------------------------------------|
          266. |                     Airbus      Heavy   2010   .3333333 |
          267. |                     Boeing      Heavy   2010         .5 |
          268. | Boeing (McDonnell-Douglas)      Heavy   2010   .1666667 |
          269. |                     Boeing      Large   2010          1 |
          270. |                     Boeing      Small   2010          1 |
               +---------------------------------------------------------+
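          A quick sanity check on output like this, sketched here on the five 2010 rows listed above rather than on the full data: within each year and submarket, the shares should add up to 1. The variable name share_sum is made up for this illustration.

          ```stata
          * Illustration only: the 2010 rows from the listing above, typed in by hand
          clear
          input str26 Manufacturer str6 Submarket int year double share
          "Airbus"                     "Heavy" 2010 .3333333
          "Boeing"                     "Heavy" 2010 .5
          "Boeing (McDonnell-Douglas)" "Heavy" 2010 .1666667
          "Boeing"                     "Large" 2010 1
          "Boeing"                     "Small" 2010 1
          end

          * Sum the shares within each year-submarket cell (share_sum is a hypothetical name)
          bysort year Submarket: egen double share_sum = total(share)

          * Stops with an error if any cell does not sum to 1, up to rounding
          assert abs(share_sum - 1) < 1e-6
          ```

          On the full collapsed data, only the last two commands would be needed.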
          Last edited by Fei Wang; 12 Dec 2021, 05:11.



          • #6
            Thank you very much Fei! Your code was really useful!

            I have an extra question regarding a variable I am trying to construct and I am really puzzled...
            This variable measures future competition and is calculated as follows: (Number of Airplanes of Manufacturer i in submarket j in period t+1)/(Number of my Competitors' Airplanes in submarket j in period t+1).

            Any ideas?



            • #7
              I don't think the competition index is well defined. If Boeing monopolizes a submarket in period t+1, it has no competitors and the denominator is zero, so what value does the index take?



              • #8
                My idea is that the variable will take values in the interval between 0 and 1, so if Boeing monopolises a particular segment then the index would get the value of 1.



                • #9
                  Originally posted by Bill Vasileiou
                  My idea is that the variable will take values in the interval between 0 and 1, so if Boeing monopolises a particular segment then the index would get the value of 1.
                  Well, the formula in #6 cannot guarantee that the index lies between 0 and 1. For example, if Boeing has 10 planes and its competitors have 6 planes in total, the index would be 10/6. You may need to rethink the formula so that it measures what you intend to measure.



                  • #10
                    1. You are right Fei. I re-examined the formula and I think it is better to define the variable as: (Number of Airplanes of Manufacturer i in submarket j in period t+1)/(Total Number of Airplanes in submarket j in period t+1).
                    This is the same as Market Share, but I want to compute it in t+1.

                    2. I have an extra minor question, which seems difficult to me because I am new to survival analysis: how can I stset my data? I know that the general syntax is stset time, failure(censor).
                    The failure variable in my dataset is Survival. My problem is with the time variable. Any idea how to define it?
                    Last edited by Bill Vasileiou; 13 Dec 2021, 04:31.



                    • #11
                      For the new definition, the code below generates share and competition (share in t+1) for every possible pair of manufacturer and submarket in every year.

                      Code:
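                      * One row per plane and year, from the earliest build year to 2017.
                      * r(min) comes from -sum-, so do not run another r-class command in between.
                      sum BuildYear
                      expand 2017-r(min)+1
                      bys SerialNumber: gen year = r(min)-1+_n

                      * A plane is working from its build year through its retirement year (inclusive)
                      gen working = cond(Survival, year >= BuildYear, year >= BuildYear & year <= year(StatusChangeDate))

                      * Share of each manufacturer within each year and submarket
                      bys year Submarket: egen total = total(working)
                      bys year Submarket Manufacturer: egen total_m = total(working)
                      gen share = total_m / total

                      collapse share total, by(year Submarket Manufacturer)

                      * Add every manufacturer-submarket-year combination that is absent from the data
                      fillin year Submarket Manufacturer

                      * Copy the submarket total into the filled-in rows; their share is 0 if the submarket is non-empty
                      bys year Submarket (total): replace total = total[1]
                      replace share = 0 if mi(share) & total > 0

                      * competition = the same manufacturer's share in the same submarket one year later
                      bys Manufacturer Submarket (year): gen competition = share[_n+1]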
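
                      To see what the last line does, here is a toy sketch with made-up shares (the manufacturers "A" and "B" and all numbers are hypothetical, not taken from the data above). Within each manufacturer, sorted by year, share[_n+1] picks up the next row's share. That is the share in t+1 as long as the years are consecutive, which they are after fillin.

                      ```stata
                      * Hypothetical toy data: two manufacturers, three consecutive years
                      clear
                      input str1 Manufacturer int year double share
                      "A" 2015 .25
                      "A" 2016 .5
                      "A" 2017 .75
                      "B" 2015 .75
                      "B" 2016 .5
                      "B" 2017 .25
                      end

                      * Lead of share within manufacturer: the value from the following year
                      bysort Manufacturer (year): gen competition = share[_n+1]
                      list, sepby(Manufacturer) noobs
                      ```

                      competition is missing in the last year of each group, because there is no t+1 row to take the share from. The same holds for 2017 in the real data.
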
                      Your second question is not a minor one. I would suggest opening a new thread in which you specify in detail what question you plan to examine with survival analysis. My intuition is that you should generate a "retire" variable (1-Survival) and use the variable "Age" to store the age of each plane (the current age of working planes and the age at retirement of retired planes). Then "stset Age, failure(retire)". But ultimately that depends on your research question.
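
                      A minimal sketch of that suggestion, using four rows from the example in #4. The variable name duration is hypothetical (a new variable is used here instead of overwriting Age), and whether age is the right time scale still depends on the research question.

                      ```stata
                      * Four rows copied from the data example in #4 (only the variables needed here)
                      clear
                      input str20 SerialNumber double(Age AgeatRetirement) byte Survival
                      "48798"          18.1 .    1
                      "45264"          .    24.4 0
                      "1046"           .    22.7 0
                      "QTR-A330-78961" .    .    1
                      end

                      * Failure indicator: 1 = retired, 0 = still in use (censored)
                      gen byte retire = 1 - Survival

                      * Analysis time: current age if still in use, age at retirement otherwise
                      * (duration is a hypothetical name)
                      gen double duration = cond(Survival, Age, AgeatRetirement)

                      stset duration, failure(retire)
                      list SerialNumber duration retire _st, noobs
                      ```

                      The plane with no recorded age (the last row) has a missing analysis time, so stset should flag it and leave it out of the analysis sample (_st = 0).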



                      • #12
                        Thank you very much Fei!

