Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating

    Dear All

    *I have sent this message to someone's inbox before i red the posting rules, so I am correcting a mistake I made.

    I am a new to Stata, I am struggling to calculate industry sales growth. I know how to calculate each firm sales growth using, " generate laggedsales =sales[_n-1] if Code==Code[_n-1] ". then Sales-Lagged sales.

    I have tried to read some previous posts on calculating industry sales growth, posted here but i failed to understand.

    Below is a data extract from a 10 year unbalanced data which has 9809 observations i intend to use.

    Kindly assist


    firm year sales industrycode
    000892 2012 796438.2 I63
    000892 2011 2062352 I63
    000892 2015 4427383 I63
    900957 2013 6379161 K70
    000409 2011 8076227 C35
    900957 2012 9840413 K70
    000613 2017 1.03e+07 H61
    000035 2013 1.50e+07 C39
    000409 2008 1.53e+07 C35
    300235 2011 1.56e+07 I65
    002148 2009 1.87e+07 I63
    002148 2008 1.94e+07 I63
    300085 2010 2.10e+07 I65
    300235 2012 2.13e+07 I65
    300288 2012 2.25e+07 I63
    300282 2011 2.37e+07 C39
    002148 2010 2.40e+07 I63
    002148 2011 2.56e+07 I63
    300017 2009 2.60e+07 I63
    600763 2008 2.69e+07 Q83
    300075 2010 2.73e+07 I65
    300085 2012 2.74e+07 I65
    300235 2013 2.74e+07 I65
    300085 2011 2.76e+07 I65
    300113 2010 2.88e+07 I64
    002502 2010 2.88e+07 C24
    300023 2010 2.91e+07 C35
    600763 2009 2.92e+07 Q83
    300282 2012 2.94e+07 C39
    000613 2014 2.97e+07 H61
    300085 2013 2.97e+07 I65
    600083 2008 2.99e+07 C39
    300348 2012 3.05e+07 I65
    300074 2010 3.05e+07 I63
    300084 2010 3.13e+07 B11
    000613 2012 3.17e+07 H61
    002502 2011 3.18e+07 C24
    000613 2010 3.21e+07 H61
    300399 2014 3.25e+07 I65
    002558 2011 3.28e+07 L72
    300348 2013 3.33e+07 I65
    300071 2010 3.36e+07 I65
    300075 2011 3.41e+07 I65
    002558 2012 3.49e+07 N78
    000536 2009 3.50e+07 C38
    300235 2014 3.51e+07 I65
    002279 2009 3.60e+07 I65
    300133 2010 3.70e+07 R86
    300023 2009 3.73e+07 C35
    300148 2010 3.75e+07 R85

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 firm float(year sales) str3 industrycode
    "000035" 2013 1.50e+07 "C39"
    "000409" 2008 1.53e+07 "C35"
    "000409" 2011  8076227 "C35"
    "000536" 2009 3.50e+07 "C38"
    "000613" 2010 3.21e+07 "H61"
    "000613" 2012 3.17e+07 "H61"
    "000613" 2014 2.97e+07 "H61"
    "000613" 2017 1.03e+07 "H61"
    "000892" 2011  2062352 "I63"
    "000892" 2012 796438.2 "I63"
    "000892" 2015  4427383 "I63"
    "002148" 2008 1.94e+07 "I63"
    "002148" 2009 1.87e+07 "I63"
    "002148" 2010 2.40e+07 "I63"
    "002148" 2011 2.56e+07 "I63"
    "002279" 2009 3.60e+07 "I65"
    "002502" 2010 2.88e+07 "C24"
    "002502" 2011 3.18e+07 "C24"
    "002558" 2011 3.28e+07 "L72"
    "002558" 2012 3.49e+07 "N78"
    "300017" 2009 2.60e+07 "I63"
    "300023" 2009 3.73e+07 "C35"
    "300023" 2010 2.91e+07 "C35"
    "300071" 2010 3.36e+07 "I65"
    "300074" 2010 3.05e+07 "I63"
    "300075" 2010 2.73e+07 "I65"
    "300075" 2011 3.41e+07 "I65"
    "300084" 2010 3.13e+07 "B11"
    "300085" 2010 2.10e+07 "I65"
    "300085" 2011 2.76e+07 "I65"
    "300085" 2012 2.74e+07 "I65"
    "300085" 2013 2.97e+07 "I65"
    "300113" 2010 2.88e+07 "I64"
    "300133" 2010 3.70e+07 "R86"
    "300148" 2010 3.75e+07 "R85"
    "300235" 2011 1.56e+07 "I65"
    "300235" 2012 2.13e+07 "I65"
    "300235" 2013 2.74e+07 "I65"
    "300235" 2014 3.51e+07 "I65"
    "300282" 2011 2.37e+07 "C39"
    "300282" 2012 2.94e+07 "C39"
    "300288" 2012 2.25e+07 "I63"
    "300348" 2012 3.05e+07 "I65"
    "300348" 2013 3.33e+07 "I65"
    "300399" 2014 3.25e+07 "I65"
    "600083" 2008 2.99e+07 "C39"
    "600763" 2008 2.69e+07 "Q83"
    "600763" 2009 2.92e+07 "Q83"
    "900957" 2012  9840413 "K70"
    "900957" 2013  6379161 "K70"
    end
    
    encode firm, gen(firmno)
    xtset firmno year
    bysort firmno(year): gen laggedsales=L1.sales
    encode, or alternatively egen group, can be used to create a variable so that we can xtset the data. For more on xtset see: https://www.stata.com/manuals13/xtxtset.pdf
    The good thing about using xtset data is that Stata recognizes that you have gaps in your data. With the code you suggested, you would be trying to look up the previous value of sales, without regard to the gaps in years. With xtset data, Stata takes years into account: if a sales value for the previous year exists, it uses that to calculate lagged values. If no value for the previous year exists, it sets the lagged value to missing. With your approach you could end up using the value of 2008 as a laggedsales value for the year 2011, where a firm has observations missing for years 2009 & 2010.

    With future questions, please post a data example using dataex. This makes it a lot easier for people on the forum to copy your data into Stata. see also the explanation on how and why in the FAQ (http://www.statalist.org/forums/help#stata).
    Code:
    ssc install dataex
    dataex

    Comment


    • #3
      Originally posted by Jorrit Gosens View Post
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str6 firm float(year sales) str3 industrycode
      "000035" 2013 1.50e+07 "C39"
      "000409" 2008 1.53e+07 "C35"
      "000409" 2011 8076227 "C35"
      "000536" 2009 3.50e+07 "C38"
      "000613" 2010 3.21e+07 "H61"
      "000613" 2012 3.17e+07 "H61"
      "000613" 2014 2.97e+07 "H61"
      "000613" 2017 1.03e+07 "H61"
      "000892" 2011 2062352 "I63"
      "000892" 2012 796438.2 "I63"
      "000892" 2015 4427383 "I63"
      "002148" 2008 1.94e+07 "I63"
      "002148" 2009 1.87e+07 "I63"
      "002148" 2010 2.40e+07 "I63"
      "002148" 2011 2.56e+07 "I63"
      "002279" 2009 3.60e+07 "I65"
      "002502" 2010 2.88e+07 "C24"
      "002502" 2011 3.18e+07 "C24"
      "002558" 2011 3.28e+07 "L72"
      "002558" 2012 3.49e+07 "N78"
      "300017" 2009 2.60e+07 "I63"
      "300023" 2009 3.73e+07 "C35"
      "300023" 2010 2.91e+07 "C35"
      "300071" 2010 3.36e+07 "I65"
      "300074" 2010 3.05e+07 "I63"
      "300075" 2010 2.73e+07 "I65"
      "300075" 2011 3.41e+07 "I65"
      "300084" 2010 3.13e+07 "B11"
      "300085" 2010 2.10e+07 "I65"
      "300085" 2011 2.76e+07 "I65"
      "300085" 2012 2.74e+07 "I65"
      "300085" 2013 2.97e+07 "I65"
      "300113" 2010 2.88e+07 "I64"
      "300133" 2010 3.70e+07 "R86"
      "300148" 2010 3.75e+07 "R85"
      "300235" 2011 1.56e+07 "I65"
      "300235" 2012 2.13e+07 "I65"
      "300235" 2013 2.74e+07 "I65"
      "300235" 2014 3.51e+07 "I65"
      "300282" 2011 2.37e+07 "C39"
      "300282" 2012 2.94e+07 "C39"
      "300288" 2012 2.25e+07 "I63"
      "300348" 2012 3.05e+07 "I65"
      "300348" 2013 3.33e+07 "I65"
      "300399" 2014 3.25e+07 "I65"
      "600083" 2008 2.99e+07 "C39"
      "600763" 2008 2.69e+07 "Q83"
      "600763" 2009 2.92e+07 "Q83"
      "900957" 2012 9840413 "K70"
      "900957" 2013 6379161 "K70"
      end
      
      encode firm, gen(firmno)
      xtset firmno year
      bysort firmno(year): gen laggedsales=L1.sales
      encode, or alternatively egen group, can be used to create a variable so that we can xtset the data. For more on xtset see: https://www.stata.com/manuals13/xtxtset.pdf
      The good thing about using xtset data is that Stata recognizes that you have gaps in your data. With the code you suggested, you would be trying to look up the previous value of sales, without regard to the gaps in years. With xtset data, Stata takes years into account: if a sales value for the previous year exists, it uses that to calculate lagged values. If no value for the previous year exists, it sets the lagged value to missing. With your approach you could end up using the value of 2008 as a laggedsales value for the year 2011, where a firm has observations missing for years 2009 & 2010.

      With future questions, please post a data example using dataex. This makes it a lot easier for people on the forum to copy your data into Stata. see also the explanation on how and why in the FAQ (http://www.statalist.org/forums/help#stata).
      Code:
      ssc install dataex
      dataex
      Thanks Jorrit for reply.

      I have installed dataex an in future I will utilise it.

      I have declared data as a panel, using xtset command. My main problem is on calculating INDUSTRY SALES GROWTH.

      Comment


      • #4
        If you already know how to use xtset, then you should also familiarize yourself with the lag/lead operators. Much better approach than the code you suggested in post #1.

        For this issue, you can do:

        Code:
        encode firm, gen(firmno)
        xtset firmno year
        bysort firmno(year): gen laggedsales=L1.sales
        
        bys industrycode year: egen ind_sales=total(sales)
        gen ind_sales_templag = ind_sales[_n-1] if industrycode==industrycode[_n-1] & year==year[_n-1]+1
        bys industrycode year: egen ind_sales_lag = max(ind_sales_templag) 
        drop ind_sales_templag
        You should also consider collapsing, if your analysis is going to be on the industry level. You could even save this dataset created below and merge values per idustry with the original firm-level
        Code:
        dataset
        collapse (sum)sales, by(industrycode year)
        encode industrycode, gen(indno)
        xtset indno year
        bysort indno(year): gen laggedsales=L1.sales

        Comment


        • #5
          Originally posted by Jorrit Gosens View Post
          If you already know how to use xtset, then you should also familiarize yourself with the lag/lead operators. Much better approach than the code you suggested in post #1.

          For this issue, you can do:

          Code:
          encode firm, gen(firmno)
          xtset firmno year
          bysort firmno(year): gen laggedsales=L1.sales
          
          bys industrycode year: egen ind_sales=total(sales)
          gen ind_sales_templag = ind_sales[_n-1] if industrycode==industrycode[_n-1] & year==year[_n-1]+1
          bys industrycode year: egen ind_sales_lag = max(ind_sales_templag)
          drop ind_sales_templag
          You should also consider collapsing, if your analysis is going to be on the industry level. You could even save this dataset created below and merge values per idustry with the original firm-level
          Code:
          dataset
          collapse (sum)sales, by(industrycode year)
          encode industrycode, gen(indno)
          xtset indno year
          bysort indno(year): gen laggedsales=L1.sales
          Thanks Jorrit, problem solved.

          However among the commands you sent, the one below, is giving me this error "factor variables and time-series operators not allowed
          r(101);"
          bysort indno(year): gen laggedsales=L1.sales
          I am also not sure what that command does.

          Please forgive me, I haven't yet mastered how to use dataex.

          Comment


          • #6
            I am not sure what is going wrong. The code runs fine on my machine and generate shouldn't have any problems with time-series operators.
            Alternatively, you can do:
            Code:
            collapse (sum)sales, by(industrycode year)
            encode industrycode, gen(indno)
            xtset indno year
            gen ind_sales_lag = sales[_n-1] if industrycode==industrycode[_n-1] & year==year[_n-1]+1
            As for dataex, the use is described in the FAQ: http://www.statalist.org/forums/help#stata.
            You could just run:
            Code:
            ssc install dataex
            dataex in 1/20
            to get an idea

            The line
            Code:
            bysort indno(year): gen laggedsales=L1.sales
            creates a lagged value of sales by using a time-series operator (L1. or a lag of 1): for explanation see page four of: https://www.stata.com/manuals13/tstsset.pdf

            Comment


            • #7
              Originally posted by Jorrit Gosens View Post
              I am not sure what is going wrong. The code runs fine on my machine and generate shouldn't have any problems with time-series operators.
              Alternatively, you can do:
              Code:
              collapse (sum)sales, by(industrycode year)
              encode industrycode, gen(indno)
              xtset indno year
              gen ind_sales_lag = sales[_n-1] if industrycode==industrycode[_n-1] & year==year[_n-1]+1
              As for dataex, the use is described in the FAQ: http://www.statalist.org/forums/help#stata.
              You could just run:
              Code:
              ssc install dataex
              dataex in 1/20
              to get an idea

              The line
              Code:
              bysort indno(year): gen laggedsales=L1.sales
              creates a lagged value of sales by using a time-series operator (L1. or a lag of 1): for explanation see page four of: https://www.stata.com/manuals13/tstsset.pdf

              Thank you Jorrit.

              Be blessed.

              Comment

              Working...
              X