  • Problems with removing data

    Hi all,

    I'm using Stata 17 on a MacBook.
    For my master's thesis I have a panel data set with accounting data for US firms for the years 2016-2020.
    However, after dropping observations with missing values, some firms no longer have data for all of 2016-2020 but, for instance, only for 2019 and 2020.
    Below is a screenshot of part of my data, with a firm that only has data for 2019 and 2020 highlighted. How do I drop the firms that do not have data for all the years (2016-2020) that I need?

    [Screenshot: Screenshot 2022-05-01 at 18.08.13.png]

    Thanks in advance!



  • #2
    15 posts in and you are still posting screenshots. Please review the FAQ Advice to see how to present data examples in the forum.

    Code:
    *Ensure no duplicates
    isid cusip fyear
    
    *Check no years outside 2016-2020
    assert inrange(fyear, 2016, 2020)
    
    *Wanted
    bys cusip: keep if _N== 5
    Last edited by Andrew Musau; 01 May 2022, 11:53.
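
    The code above relies on the assert passing, so that five observations per firm means five distinct years in 2016-2020. As a minimal sketch of a variant that still works when other years are present, one could count the in-window years per firm directly. It assumes, as above, that cusip identifies firms and fyear is the fiscal year; n_in_window is a hypothetical helper variable, not something in the original data.

    ```stata
    * Sketch: keep firms observed in each of the five years 2016-2020.
    * Assumes cusip identifies firms and fyear holds the fiscal year.
    isid cusip fyear

    * Count, per firm, the observations that fall inside the window.
    * n_in_window is a hypothetical helper variable.
    bysort cusip: egen n_in_window = total(inrange(fyear, 2016, 2020))

    * Keep complete firms, and only their in-window years.
    keep if n_in_window == 5 & inrange(fyear, 2016, 2020)
    drop n_in_window
    ```

    Because isid guarantees one observation per firm-year, a count of 5 inside the window can only mean that all five years are present.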



    • #3
      Hi Andrew,

      Thank you very much!
      I posted a screenshot because it didn't work when I tried to post a small sample of my dataset as described in the FAQ.

      Your code keeps only the firms with data for all of the years 2016-2020, as intended.
      However, within each company the years are no longer in order from 2016 to 2020 but mixed up (e.g. 2017, 2019, 2020, 2016).
      I tried to fix this with the 'egen' command, but that doesn't work.

      Do you maybe know how I can get it back in the order from 2016 to 2020?




      • #4
        Originally posted by Qi vd Kolk
        I posted a screenshot because it didn't work when I tried to post a small sample of my dataset as described in the FAQ.
        You can restrict the data example to a range of variables and observations. See

        Code:
        help dataex
        In your example:

        Code:
        dataex datadate-cusip in 115/140
        will provide a data example of your screenshot in #1.


        Do you maybe know how I can get it back in the order from 2016 to 2020?
        Code:
        sort cusip fyear
        See

        Code:
        help sort
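
        On why the order changed: the bys prefix in #2 sorts on cusip only, and Stata's sort does not promise any particular order among observations that tie on the sort key unless the stable option is used. A minimal sketch that avoids the extra sort step, assuming the same variable names as in #2, puts fyear in parentheses so that the data are sorted by cusip and fyear while the groups are still formed on cusip alone. The final check is only valid if the isid and assert checks from #2 passed.

        ```stata
        * Sort by cusip, then fyear, but form the by-groups on cusip only.
        bysort cusip (fyear): keep if _N == 5

        * Optional check: within each firm the years should now run 2016, ..., 2020.
        * assert exits with an error if this fails for any observation.
        bysort cusip (fyear): assert fyear == 2015 + _n
        ```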



        • #5
          Thank you very much Andrew, both codes work!
