  • Help on Stata Data cleaning and analysis

    I have a dataset of company names and their daily return values. I want to calculate the information share month by month, but first I need to clean the data. These are the issues I need help with:
    1) I want to drop a company from the dataset if its return is missing on even a single day. Example: in the following table, a and d each have a missing ret, so both a and d should be dropped from the dataset completely (including the observations where ret is not missing).
    Date           Company name   ret
    1st Jan 2020   a              1
    1st Jan 2020   b              2
    1st Jan 2020   c              3
    1st Jan 2020   d
    2nd Jan 2020   a
    2nd Jan 2020   b              1
    2nd Jan 2020   c              2
    2nd Jan 2020   d              3
    3rd Jan 2020   a              1
    3rd Jan 2020   b              2
    3rd Jan 2020   c              3
    3rd Jan 2020   d              1
    4th Jan 2020   a
    4th Jan 2020   b              1
    4th Jan 2020   c              2
    5th Jan 2020   a              3
    5th Jan 2020   b              1
    5th Jan 2020   c              2
    6th Jan 2020   d              3
    6th Jan 2020   a              3
    2) I want to calculate the information share if a company (say c) has, for example, 19 returns in Jan 2020 and 19 returns in the subsequent month (Feb 2020), i.e. an equal number of returns in both months. I was planning to use a dummy variable to distinguish the cases: 0 if the counts of ret for Jan and Feb are equal and 1 otherwise. How should I go about this? Any alternative suggestions are welcome.

    P.S. I am new to Stata and would appreciate your help. Thanks a lot.

  • #2
    1) is

    Code:
    bysort company (ret) : drop if missing(ret[_N])
    I don’t follow what you want otherwise.
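
    For reference, the line above relies on Stata sorting missing numeric values last: after sorting by ret within company, ret[_N] is missing exactly when that company has at least one missing return. Below is a minimal sketch on the first two days of the example in #1. It assumes the variables are named company and ret and that ret is numeric; the date strings are illustrative. It also shows an egen-based alternative that should flag the same companies.

    ```stata
    clear
    input str12 date str1 company ret
    "1jan2020" "a" 1
    "1jan2020" "b" 2
    "1jan2020" "c" 3
    "1jan2020" "d" .
    "2jan2020" "a" .
    "2jan2020" "b" 1
    "2jan2020" "c" 2
    "2jan2020" "d" 3
    end

    * Alternative: flag companies with at least one missing return
    bysort company: egen byte anymiss = max(missing(ret))

    * Sort trick from #2: missing sorts last, so ret[_N] is missing if any ret is
    bysort company (ret): drop if missing(ret[_N])

    * Both approaches agree here: no flagged company survives, only b and c remain
    assert anymiss == 0
    assert inlist(company, "b", "c")
    ```

    With the egen version alone, drop if anymiss would remove the same observations without changing the sort order within company.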



    • #3
      Thank you for the response.

      For 2), I want to create a dummy variable for the Jan returns. The dummy will be 0 if the number of returns for a company x in a month equals the number of returns for the same company in the subsequent month, and 1 otherwise. I hope the query is clear now.
      PS: I have already created a variable t for the time series such that Jan 2020 is coded as 1, Feb 2020 as 2, etc.
      Thank you again.
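
      One possible reading of this request, sketched under assumptions rather than as a confirmed solution: count the nonmissing returns per company and month, compare each month's count with the next month's, and attach the result to the daily data. The variable names company, t and ret are taken from the thread; nret, unequal and the tempfile counts are hypothetical names introduced here. Months are assumed to be consecutive integers in t.

      ```stata
      * Assumes a daily dataset in memory with variables company, t and ret,
      * where t is the month index described above (1 = Jan 2020, 2 = Feb 2020, ...)
      preserve

      * One row per company-month holding the number of nonmissing returns
      collapse (count) nret = ret, by(company t)

      * 0 if this month's count equals next month's, 1 otherwise;
      * left missing when the next consecutive month is not in the data
      bysort company (t): gen byte unequal = nret != nret[_n+1] if t[_n+1] == t + 1

      tempfile counts
      save `counts'
      restore

      * Attach the monthly counts and the dummy to every daily observation
      merge m:1 company t using `counts', nogenerate
      ```

      Leaving the dummy missing for a company's last month avoids classifying that month as unequal merely because there is nothing to compare it with.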



      • #4
        Hi Nick Cox, with reference to the code you provided, what would it look like if I wanted to run it for each value of t (my time-series variable)?



        • #5
          #4 That would just be equivalent to

          Code:
          drop if missing(ret)
          if I understand correctly.

          (I still don't understand #3.)
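
          The equivalence above holds when t identifies individual days, since each company then has a single observation per value of t. If t instead indexes months, as described in #3, a hedged sketch of the per-month version is to add t to the by-group, so that a company's observations are dropped only for those months in which it has a missing return (variable names as used in this thread):

          ```stata
          * Assumes company, t (month index) and ret exist, with ret numeric.
          * Within each company-month, missing ret sorts last, so ret[_N] is
          * missing exactly when that company-month has any missing return.
          bysort company t (ret): drop if missing(ret[_N])
          ```

          Note that this keeps a company's complete months, unlike the rule in 1), which removes the company entirely.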

