  • how to make groups of non-missing observations

    Dear Stata Users,

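    Please help me with the following issue. I want to create a variable "group" that identifies, within each firm ("gvkey"), every run of consecutive observations with non-missing "cons_year". The groups should be numbered consecutively across the whole dataset, and "group" should be missing wherever "cons_year" is missing. For the data below, this is what I expect: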

    Expected Result:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 gvkey double fyear float(p_v_decile cons_year group)
    "001009" 1992 3 . .
    "001009" 1994 . . .
    "001013" 1992 4 . .
    "001013" 1993 3 . .
    "001013" 1994 5 1 1
    "001013" 1995 5 2 1
    "001013" 1996 5 3 1
    "001013" 1997 4 . .
    "001013" 1998 5 1 2
    "001013" 1999 5 2 2
    "001013" 2000 5 3 2
    "001013" 2001 3 . .
    "001013" 2002 1 . .
    "001013" 2003 4 . .
    "001013" 2004 5 1 3
    "001013" 2005 5 2 3
    "001013" 2006 2 . .
    "001013" 2007 2 . .
    "001013" 2008 2 . .
    "001013" 2009 2 . .
    "001013" 2010 3 . .
    "001034" 1992 4 . .
    "001034" 1993 4 . .
    "001034" 1995 . . .
    "001034" 1997 . . .
    "001034" 1998 . . .
    "001034" 1999 5 1 4
    "001034" 2000 5 2 4
    end


    What I have now:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 gvkey double fyear float(p_v_decile cons_year)
    "001009" 1992 3 .
    "001009" 1994 . .
    "001013" 1992 4 .
    "001013" 1993 3 .
    "001013" 1994 5 1
    "001013" 1995 5 2
    "001013" 1996 5 3
    "001013" 1997 4 .
    "001013" 1998 5 1
    "001013" 1999 5 2
    "001013" 2000 5 3
    "001013" 2001 3 .
    "001013" 2002 1 .
    "001013" 2003 4 .
    "001013" 2004 5 1
    "001013" 2005 5 2
    "001013" 2006 2 .
    "001013" 2007 2 .
    "001013" 2008 2 .
    "001013" 2009 2 .
    "001013" 2010 3 .
    "001034" 1992 4 .
    "001034" 1993 4 .
    "001034" 1995 . .
    "001034" 1997 . .
    "001034" 1998 . .
    "001034" 1999 5 1
    "001034" 2000 5 2
    end

  • #2
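    If the observations are already in the right order (as in your example), take a running sum and then set it to missing wherever cons_year is missing: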

    Code:
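    * running count of run starts; sum() treats missing values as zero
    gen group = sum(cond(cons_year == 1, 1, .))
    * blank out observations that do not belong to any run
    replace group = . if cons_year == .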
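
    The "if the order is correct" condition can be made explicit. A minimal sketch, assuming the intended order is by firm and then fiscal year, and that each gvkey-fyear pair occurs only once (neither assumption is stated in #1). Run these two lines before the gen above:

    ```stata
    * Assumption: one observation per firm-year; isid stops with an error otherwise
    isid gvkey fyear
    * Put the data in firm-year order before taking any running sum
    sort gvkey fyear
    ```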

    • #3
      I think either of the solutions below gives what you describe in #1, provided the data are in the order shown there.
      Code:
      gen group1 = sum(p_v_decile != p_v_decile[_n-1]) if p_v_decile == 5
      or
      Code:
      gen group2 = sum(cons_year == 1) if cons_year != .
      But I suspect group3 below, which sorts by year within firm and restarts the numbering for each firm, may be closer to what you have in mind.
      Code:
      bys gvkey (fyear): gen group3 = sum(cons_year == 1) if cons_year != .
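
      If numbering that restarts within each firm is not what is wanted, group3 can be turned into a single identifier that is unique across firms. A sketch, assuming group3 was created as above; the name group4 is only illustrative:

      ```stata
      * egen's group() numbers each distinct (gvkey, group3) combination 1, 2, 3, ...
      * Observations with missing group3 get a missing group4, because the
      * -missing- option is not specified
      egen group4 = group(gvkey group3)
      ```

      On the example data this should reproduce the "group" variable from the expected result in #1.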
