  • a complicated dummy variable capturing values pertaining to years that are not the year identifier

    Dear All

    I have data on firms that a regulator required to restate (i.e., correct) previously misstated financial reports.

    The data show that a firm may disclose to the public in, say, 2005 (disclosure_yr in my dataset) that it has restated earlier reports, but this does not necessarily mean that the 2005 report itself was misstated; 2005 is only the year in which the restatement was disclosed. The data also show which years the misstatements belong to. In my example, these could be 2001 to 2003, where 2001 is begin_yr and 2003 is end_yr, and all years from 2001 to 2003 were misstated. There is a variable called res that is 1 when the firm discloses that it has restated previous reports (those within the begin_yr to end_yr range), and 0 otherwise. However, the value of 1 is attached to disclosure_yr, not to the misstated years.

    The data looks like:
    Firm_id disclosure_yr begin_yr end_yr res
    2178 2006 2005 2005 0
    2491 2005 2002 2005 1
    2491 2007 2004 2006 1
    3116 2002 2000 2001 1
    ...... ...... ......... ........ 0
    In the above data, firm id 3116 has disclosed to the public in 2002 that years 2000 and 2001 were previously misstated.

    I am struggling to create a dummy variable that is
    equal to 1 for the misstated years (i.e., the years from begin_yr to end_yr), which are identified by res=1 in a later disclosure year, and 0 otherwise.
    In other words, I want to assign a value of 1 only to the years that were misreported, not to the year in which the misreporting was disclosed. Note that res=1 in the data corresponds to disclosure_yr.
    Then I want to sort my data by Firm_id and disclosure_yr so that it is ready to merge with another dataset.

    I know the direct way to create a dummy:
    gen dummy=0
    replace dummy=1 if res==1

    However, this will not achieve my goal, because the res values (1/0) are assigned to the disclosure year rather than to the misstated years, which is not what I want.


    Any smart ideas?


  • #2
    Is this at least a start on what you need?
    Code:
    clear
    input Firm_id    disclosure_yr    begin_yr    end_yr    res
    2178    2006    2005    2005    0
    2491    2005    2002    2005    1
    2491    2007    2004    2006    1
    3116    2002    2000    2001    1
    end
    expand end_yr-begin_yr+1
    sort Firm_id disclosure_yr
    by Firm_id disclosure_yr: generate res_yr = begin_yr+_n-1 if res
    list, clean noobs
    Code:
        Firm_id   disclo~r   begin_yr   end_yr   res   res_yr  
           2178       2006       2005     2005     0        .  
           2491       2005       2002     2005     1     2002  
           2491       2005       2002     2005     1     2003  
           2491       2005       2002     2005     1     2004  
           2491       2005       2002     2005     1     2005  
           2491       2007       2004     2006     1     2004  
           2491       2007       2004     2006     1     2005  
           2491       2007       2004     2006     1     2006  
           3116       2002       2000     2001     1     2000  
           3116       2002       2000     2001     1     2001
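
One way to get from the listing above to the dummy asked about in the first post is sketched below. It assumes the other dataset is keyed on firm and fiscal year; the names misstated and year are hypothetical and do not appear in the thread.

```stata
* Sketch: turn the expanded filing-year rows into one row per firm-year.
clear
input Firm_id disclosure_yr begin_yr end_yr res
2178 2006 2005 2005 0
2491 2005 2002 2005 1
2491 2007 2004 2006 1
3116 2002 2000 2001 1
end
expand end_yr-begin_yr+1
sort Firm_id disclosure_yr
by Firm_id disclosure_yr: generate res_yr = begin_yr+_n-1 if res

* Keep only rows that identify a misstated year, then flag them.
drop if missing(res_yr)
generate byte misstated = 1      // hypothetical variable name

* A year can be covered by more than one filing (2491: 2004 and 2005),
* so reduce to one row per firm-year before merging.
collapse (max) misstated, by(Firm_id res_yr)
rename res_yr year               // hypothetical name for the merge key
sort Firm_id year
list, clean noobs
```

After merging this into a firm-year panel, firm-years with no match would have misstated missing, and those could then be set to 0.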



    • #3
      Dear William
      Many thanks for your reply.
      I ran the code, but it appears to generate a res_yr that covers all years between begin_yr and end_yr except end_yr (i.e., it does not include the last year).
      I want both begin_yr and end_yr to be included, along with all years in between.
      Do you know how to fix that?

      Thanks



      • #4
        In the example I posted, the code generated all years between begin_yr and end_yr including end_yr. How do your code and data differ from the code and data I posted? Is it possible you omitted the "+1" at the end of the expand command? Can you create and post a reproducible example like mine that fails?
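
A quick way to check whether end_yr is included is sketched below on the example data from post #2. The helper variables last_yr and n_rows are hypothetical names; if either assertion fails on other data, the by-variables probably do not identify a single filing.

```stata
* Sketch: verify that each restating filing reaches its end_yr.
clear
input Firm_id disclosure_yr begin_yr end_yr res
2178 2006 2005 2005 0
2491 2005 2002 2005 1
2491 2007 2004 2006 1
3116 2002 2000 2001 1
end
expand end_yr-begin_yr+1
sort Firm_id disclosure_yr
by Firm_id disclosure_yr: generate res_yr = begin_yr+_n-1 if res

* last_yr: largest res_yr within the filing (hypothetical helper).
by Firm_id disclosure_yr: egen last_yr = max(res_yr)
assert last_yr == end_yr if res

* n_rows: number of rows in the filing after expand (hypothetical helper).
by Firm_id disclosure_yr: generate n_rows = _N
assert n_rows == end_yr-begin_yr+1
```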



        • #5
          I attached the data, and my code is below:

          use AA_Res2000-2014AccFra.dta,clear
          compress
          format file_date %d
          format res_begin_date %d
          format res_end_date %d
          gen file_yr=year(file_date) // this is the year when restatement is disclosed to the public
          gen yrbegin=year( res_begin_date) // this is the first year restated
          gen yrend=year( res_end_date) // this is the last year restated
          rename company_fkey cik

          destring cik,replace

          duplicates tag cik file_yr ,generate(newvariable2)
          drop if newvariable2>0
          drop newvariable2

          expand yrend-yrbegin+1

          sort cik file_yr
          by cik file_yr: generate res_yr = yrbegin+_n-1 if res_accounting

          xtset cik file_yr

          ************************************************************
          Now, try to sort by cik and res_yr such that:
          sort cik res_yr

          and you will see clearly that the end year is not incorporated.

          Look forward to hearing from you and all participants



          • #6
            Your problem is that, for example, cik 2491 has had two filings with restatements, and your sort leaves the rows for the two filings intermingled, apparently causing you to overlook the observation with the end year you are looking for. If you replace the xtset in the code you supplied (which fails, and causes the do-file to stop) with the following code, you'll see results like those shown, and will see clearly that the end year is incorporated correctly.
            Code:
            capture xtset cik file_yr
            
            sort cik file_date res_yr
            list cik file_date res_yr yrbegin yrend res_yr, noobs sepby(cik file_date)
            Code:
              +---------------------------------------------------------+
              |     cik   file_date   res_yr   yrbegin   yrend   res_yr |
              |---------------------------------------------------------|
              |    2178   08mar2006        .      2005    2005        . |
              |---------------------------------------------------------|
              |    2491   03nov2005     2002      2002    2005     2002 |
              |    2491   03nov2005     2003      2002    2005     2003 |
              |    2491   03nov2005     2004      2002    2005     2004 |
              |    2491   03nov2005     2005      2002    2005     2005 |
              |---------------------------------------------------------|
              |    2491   01nov2007     2004      2004    2006     2004 |
              |    2491   01nov2007     2005      2004    2006     2005 |
              |    2491   01nov2007     2006      2004    2006     2006 |
              |---------------------------------------------------------|
              |    3116   02apr2002     2000      2000    2001     2000 |
              |    3116   02apr2002     2001      2000    2001     2001 |
              |---------------------------------------------------------|
              |    3116   31dec2003     2003      2003    2003     2003 |
              |---------------------------------------------------------|
              |    3116   20may2005        .      2005    2005        . |
              |---------------------------------------------------------|
              |    3116   07aug2012     2011      2011    2012     2011 |
              |    3116   07aug2012     2012      2011    2012     2012 |
              |---------------------------------------------------------|
              |    3116   01mar2013     2011      2011    2011     2011 |
              |---------------------------------------------------------|



            • #7
              If I run this code:

              use AA_Res2000-2014AccFra.dta,clear
              compress
              format file_date %d
              format res_begin_date %d
              format res_end_date %d
              gen file_yr=year(file_date) // this is the year when restatement is disclosed to the public
              gen yrbegin=year( res_begin_date) // this is the first year restated
              gen yrend=year( res_end_date) // this is the last year restated
              rename company_fkey cik

              destring cik,replace

              expand yrend-yrbegin+1

              sort cik file_yr
              by cik file_yr: generate res_yr = yrbegin+_n-1 if res_accounting | res_fraud // this will generate a res_yr for firms that have res_acc or res_fraud

              capture xtset cik file_yr

              sort cik file_date res_yr
              drop if res_yr<2000 // I restrict the sample to firms that start to file in 2000 (earlier res_yr values appear because firms that filed in 2000 could have misreported years before 2000).

              drop if res_yr==. // this will drop firms that did not restate in a year (they have restated because of clerical errors which I do not include here)

              Then,
              browse if cik==2178


              The data show that this example firm has two restatements filed in 2003 (one covering 2001-2002 and another covering 2002-2002);
              however, the res_yr variable captures years 2001, 2002 and 2004. I do not understand why 2004 is there.



              • #8
                Your problem is that you assumed I would understand enough accounting to infer that a firm might have multiple disclosures in a single year, and your original example only showed disclosure years, not the disclosure dates you subsequently revealed that might have suggested the possibility. If you now change
                Code:
                sort cik file_yr
                by cik file_yr: generate res_yr = yrbegin+_n-1 if res_accounting | res_fraud
                to
                Code:
                sort cik file_date
                by cik file_date: generate res_yr = yrbegin+_n-1 if res_accounting | res_fraud
                things should work as you expect, with the warning that this code will fail if a firm has multiple disclosures on a single date.
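
If multiple disclosures on a single date are a concern, one possible workaround is to number the original rows before expanding and to group on that number instead. The sketch below uses made-up rows purely for illustration (cik 1001 and its dates are not from the attached data), and filing_id is a hypothetical variable name.

```stata
* Sketch: count years within each original filing row, not within cik/date.
* The rows below are invented; the first two share a firm and a filing date.
clear
input cik str9 fdate yrbegin yrend res_accounting res_fraud
1001 "03nov2005" 2002 2003 1 0
1001 "03nov2005" 2004 2004 0 1
1001 "01nov2007" 2005 2006 1 0
end
generate file_date = date(fdate, "DMY")
format file_date %td
drop fdate

* filing_id (hypothetical) tags each original row before it is expanded.
generate long filing_id = _n
expand yrend-yrbegin+1
bysort filing_id: generate res_yr = yrbegin+_n-1 if res_accounting | res_fraud

sort cik file_date filing_id res_yr
list, noobs sepby(filing_id)
```

Because _n restarts within each filing_id, two filings that share a cik and file_date no longer run into each other.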

                You're welcome.



                • #9
                  William;
                  You are brilliant (now in green).

                  Your code works perfectly. Thanks for that !

                  If I spot something unclear, I will be back. Having said that, it seems to be working very well !

                  Thanks for your time! Much appreciated.



                  • #10
                    Hi, if I want to examine subsequent restatements over a three-year period (until the end of 2020), how should I adapt the code above? Thank you so much!

