
  • #16
    Roy Student: We ask for full real names here. Please read and act on http://www.statalist.org/forums/help#realnames

    Looking at your questions: It seems that a string variable costat has the value A if active and I if inactive. You don't give other variable names, but something equivalent to

    Code:
     
    bysort firmid : egen OK = total(costat == "A" & year == 1993)
    will count how many observations for each firm show that the firm was active in 1993. You want to drop firms with values of 0 or equivalently keep those with values of 1.


    Missing values of total assets. You can count how many you have for each firm with something like

    Code:
    by firmid: egen nmissing = total(missing(total_assets))
    but there isn't a single best approach to missings on which all researchers agree. Three approaches among many others are (1) to keep only panels on which no value is missing (2) to use some threshold (e.g. that you want 5 years or more with non-missing values) (3) that you just drop observations with missing values. There are arguments for and against all of those.
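A minimal sketch of those three approaches on made-up data. The variable names firmid, year and total_assets, the toy values, and the threshold of 2 non-missing years (rather than 5, because the toy panels are short) are all assumptions for illustration, not taken from the original question.

```stata
* Sketch only: variable names and data are hypothetical
clear
input firmid year total_assets
1 1993 100
1 1994 .
1 1995 120
2 1993 50
2 1994 60
2 1995 70
3 1993 .
3 1994 .
3 1995 80
end

* Per-firm counts of missing and non-missing values
bysort firmid: egen nmissing = total(missing(total_assets))
by firmid: egen npresent = total(!missing(total_assets))

* (1) keep only panels in which no value is missing
preserve
keep if nmissing == 0
list, sepby(firmid)
restore

* (2) keep panels with at least a threshold number of non-missing years
*     (2 is an assumed threshold chosen to suit the toy data)
preserve
keep if npresent >= 2
list, sepby(firmid)
restore

* (3) drop only the observations with missing values
drop if missing(total_assets)
list, sepby(firmid)
```

preserve and restore are used here only so that each approach starts from the same data; in real work you would pick one approach and apply it once.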



    • #17
      Thank you for your reply, Nick. The problem now is that I am left only with firms that show "active" for all years. The dataset should also include firms that are, for instance, inactive in 2005 but were active in 1993. (I am replicating a study and this is what the authors described, so I think there will be such firms.)



      • #18
        I can't see a reproducible example of your problem, with code and sample data. Please read http://www.statalist.org/forums/help#stata as well as http://www.statalist.org/forums/help#realnames


        Here is a minimal example as proof of concept:


        Code:
        . clear
        
        . input firm_id year str1 costat
        
               firm_id       year     costat
          1. 1  1993 "A"
          2. 1  2015 "I"
          3. 2  1993 "I"
          4. 2  2015 "A"
          5. end
        
        . bysort firm_id : egen OK = total(costat == "A" & year == 1993)
        
        . list, sepby(firm_id)
        
             +------------------------------+
             | firm_id   year   costat   OK |
             |------------------------------|
          1. |       1   1993        A    1 |
          2. |       1   2015        I    1 |
             |------------------------------|
          3. |       2   1993        I    0 |
          4. |       2   2015        A    0 |
             +------------------------------+
        
        . keep if OK
        (2 observations deleted)
        
        . list, sepby(firm_id)
        
             +------------------------------+
             | firm_id   year   costat   OK |
             |------------------------------|
          1. |       1   1993        A    1 |
          2. |       1   2015        I    1 |
             +------------------------------+
        The example shows that one firm is kept because it was active in 1993 and the other firm is dropped. Nothing is said in the code about any other year, so nothing else bites.

        Here is the code all in one so that you can try it for yourself (copy and paste into a Do-file editor window).

        Code:
        clear
        input firm_id year str1 costat
        1  1993 "A"
        1  2015 "I"
        2  1993 "I"
        2  2015 "A"
        end
        bysort firm_id : egen OK = total(costat == "A" & year == 1993)
        list, sepby(firm_id)
        keep if OK
        list, sepby(firm_id)

        So what is the explanation for your perception? Perhaps the code you applied was different. Perhaps the dataset is not as you think. Perhaps something that you did is crucial. I can't say which. We need more information on that.
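One thing that could be checked, purely as a guess about what might be going on, is whether costat varies within firms at all in the real dataset. The sketch below extends the toy data above with a hypothetical third firm that is active in both years; the variable names varies and tag are made up for illustration.

```stata
* Sketch only: toy data, with firm 3 added as a hypothetical constant-status firm
clear
input firm_id year str1 costat
1  1993 "A"
1  2015 "I"
2  1993 "I"
2  2015 "A"
3  1993 "A"
3  2015 "A"
end

* Within each firm, sort on costat and compare the first and last values:
* they differ only if the firm has more than one status over time
bysort firm_id (costat): gen byte varies = costat[1] != costat[_N]

* Tag one observation per firm so that each firm is counted once
egen byte tag = tag(firm_id)
tabulate varies if tag
```

If every firm in the real data showed varies equal to 0, then status would be constant within firm, and keeping firms that were active in 1993 would necessarily leave only firms that show active in every year.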



        • #19
          "Roy Student" didn't reply. But "MPolo1990" asked essentially the same question at http://stackoverflow.com/questions/3...articular-year



          • #20
            @assets: keep if assets!=. // keeps only the non-missing values
            @if alive once: gen alive=cond(var>0 & year>=YY,1,0) // indicator for each obs: 1 if the firm is alive in or after a certain year
            bysort firm: egen nalive=total(alive) // total number of years alive for each firm
            keep if nalive>1



            • #21
              Sure:

              Code:
              keep if totalassets !=.
              For the second question: did you use xtset or tsset?



              • #22
                Hi, I've got a similar problem to Janine.

                I've got a variable representing 24h-time in str5 format, displayed as:
                HH:MM
                12:34
                23:19
                0:11
                ...

                I would like to drop/keep observations between 08:00 and 20:00, and vice versa.

                Is there an elegant way to write the code?

                Thanks!



                • #23
                  Here's keep.

                  . version 15.1

                  .
                  . clear *

                  .
                  . input str6 then

                            then
                    1. "12:34"
                    2. "23:19"
                    3. "0:11"
                    4. end

                  .
                  . keep if inrange(clock(then, "hm"), clock("08:00", "hm"), clock("20:00", "hm"))
                  (2 observations deleted)

                  .
                  . list, noobs

                    +-------+
                    |  then |
                    |-------|
                    | 12:34 |
                    +-------+

                  .
                  . exit

                  end of do-file


                  .
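The "vice versa" case (drop rather than keep) might look like the sketch below, which reuses the same three example times. The helper variable t is an assumption added for readability; it is not part of the original answer.

```stata
* Sketch only: same toy data as in the log above
clear
input str6 then
"12:34"
"23:19"
"0:11"
end

* Convert the string once to a numeric datetime (stored as double)
gen double t = clock(then, "hm")

* Complement of the keep: drop times from 08:00 to 20:00 inclusive
drop if inrange(t, clock("08:00", "hm"), clock("20:00", "hm"))

list, noobs
```

Note that inrange() returns 0 when its first argument is missing, so any time string that clock() cannot parse would survive this drop; it may be worth checking for missing values of t first.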



                  • #24
                    great thank you!

                    if i were to tabulate another independent variable based on the above condition:

                    tab var1 if inrange(clock(then, "hm"), clock("08:00", "hm"), clock("20:00, "hm"))

                    I will get the error "too few quotes"

                    is there something I'm doing wrong?

                    many thanks in advance!



                    • #25
                      Originally posted by Mesut Ozil
                      is there something I'm doing wrong?
                      Code:
                      tabulate var1 if inrange(clock(then, "hm"), clock("08:00", "hm"), clock("20:00", "hm"))
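The only difference from the command that failed is the closing double quote after 20:00. One way to make that kind of slip less likely, offered only as a sketch, is to compute the two bounds once and store them in local macros, so the long condition carries fewer quoted strings. The macro names lo and hi and the toy values of var1 are assumptions for illustration.

```stata
* Sketch only: var1 values are made up
clear
input str6 then var1
"12:34" 1
"23:19" 2
"0:11"  3
end

* Evaluate each bound once and keep the numeric result in a local macro
local lo = clock("08:00", "hm")
local hi = clock("20:00", "hm")

* Run these lines together (e.g. from a do-file), since local macros
* do not persist between separate runs of selected lines
tabulate var1 if inrange(clock(then, "hm"), `lo', `hi')
```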



                      • #26
                        oh my, what an amateur mistake. thank you!
