  • treating missing values due to lagged variable creation

    Hello!

    I am currently writing my master's thesis on the effect of implementing GSCMP on financial performance. To evaluate the effect on financial performance one year later, I want to lag my independent variable by one year. However, I only want to include observations that have at least two years of consecutive data available, so I used the command shown below. My initial dataset contains 8508 observations over a 10-year period (2006-2015). I was wondering how to treat the 1436 generated missing values when running my regression analyses: should I delete them, or what is the regular procedure when creating lagged variables?

    . xtset ID year
    panel variable: ID (unbalanced)
    time variable: year, 2006 to 2015, but with gaps
    delta: 1 unit

    . by ID: gen L1 = GSCMP[_n-1] if year==year[_n-1]+1
    (1436 missing values generated)

    Thank you in advance.
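
    For readers following along, the manual lag in the command above can be compared with Stata's built-in lag operator `L.` on a small made-up panel. This is only a sketch: the data values and the name `L1_op` are invented for illustration, and the explicit `sort` is included so that the manual version does not depend on the current sort order.

    ```stata
    * Toy panel with a gap (hypothetical values): firm 1 has no 2008 record
    clear
    input ID year GSCMP
    1 2006 10
    1 2007 12
    1 2009 15
    2 2006 20
    2 2007 21
    2 2008 23
    end
    xtset ID year

    * Manual lag, as in the post
    sort ID year
    by ID: gen L1 = GSCMP[_n-1] if year == year[_n-1] + 1

    * Built-in lag operator: missing whenever the previous year is absent
    gen L1_op = L.GSCMP

    * Both versions agree, including on which observations are missing
    assert L1 == L1_op
    count if missing(L1)
    ```

    In this toy panel the lag is missing for each firm's first year and for the year that follows a gap, which is the same mechanism that produces the missing values reported in the post.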

  • #2
    Noor:
    Welcome to this forum.
    The missing values are a by-product of creating the lagged variable -L1- and do not require any special treatment: Stata's estimation commands automatically exclude observations with missing values.
    That said, I would recommend discussing with your supervisor the choice of including only panels with 2 or more consecutive waves of data, as missing values might be informative (i.e., not missing at random).
    Kind regards,
    Carlo
    (Stata 19.0)
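
    A minimal sketch of the point about missing values, using simulated data: `FINPERF` is a hypothetical name for the outcome variable, the 20% drop rate is arbitrary, and the fixed-effects model is chosen only for illustration. The estimation command leaves out the rows where the lag is missing, so nothing has to be deleted by hand.

    ```stata
    * Simulated unbalanced panel: 50 firms, 2006-2015, with random gaps
    clear
    set seed 12345
    set obs 50
    gen ID = _n
    expand 10
    bysort ID: gen year = 2005 + _n
    drop if runiform() < 0.2
    gen GSCMP = rnormal()
    gen FINPERF = rnormal()
    xtset ID year

    * Rows where L.GSCMP is missing are excluded automatically
    xtreg FINPERF L.GSCMP, fe

    * e(sample) flags the rows actually used in the estimation
    gen byte used = e(sample)
    count if missing(L.GSCMP)
    display "rows with a missing lag: " r(N) ", rows used by xtreg: " e(N)
    ```

    Keeping the full dataset and relying on `e(sample)` also makes it easy to inspect later which firm-years were left out.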


    • #3
      Thank you for your fast reply, Carlo! My reasoning behind including only IDs with 2 or more consecutive yearly observations was to make it possible to evaluate the effect of GSCMP on next year's financial performance by lagging GSCMP. I'm relatively new to Stata. Do you think it's not necessary to limit my dataset to companies with at least two consecutive yearly observations?


      • #4
        Noor:
        The main issue here is not Stata, but the fact that you seem to have an unbalanced panel dataset (by the way, Stata can handle both balanced and unbalanced panel datasets without any problem, so you do not have to worry about that).
        You should rule out that firms that provided data for, say, one year only differ from those that provided data for 2 or more years; if they did differ, missingness would probably be informative and you would end up with a dataset that is a non-random sample of the original one.
        Kind regards,
        Carlo
        (Stata 19.0)
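
        One possible way to start looking into this is to flag firms observed in a single year and compare them with the rest on an observed variable. The sketch below uses simulated data; `n_obs`, `single`, and `nyears` are hypothetical names, and the t test is just one of several comparisons that could be run. A difference on observed variables would only be suggestive, and the absence of one does not prove that missingness is ignorable.

        ```stata
        * Simulated firms: roughly 30% observed for 1 year, the rest for 5 years
        clear
        set seed 2024
        set obs 200
        gen ID = _n
        gen nyears = cond(runiform() < 0.3, 1, 5)
        expand nyears
        bysort ID: gen year = 2005 + _n
        gen GSCMP = rnormal()

        * Number of observed years per firm, and a flag for single-year firms
        bysort ID: gen n_obs = _N
        gen byte single = (n_obs == 1)

        * How many firm-years fall in each group, and do the groups differ on GSCMP?
        tabulate single
        ttest GSCMP, by(single)
        ```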


        • #5
          I do indeed have an unbalanced panel dataset, in order to avoid certain biases. Thank you for your comment! I will consider this.
