How to calculate a dummy variable for firms at different quantiles?

Sally Ahmed

Join Date: Dec 2020

Posts: 59
#1

24 Apr 2022, 11:40

Hello,

I would like to know the Stata code to create a dummy variable that takes the value 1 if a firm belongs to the 75th percentile of the X variable and 0 otherwise, and another dummy variable that takes the value 1 if a firm belongs to the 25th percentile of the X variable and 0 otherwise.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17613

24 Apr 2022, 11:48

Sally:
you can do much better than that in a very easy way:

Code:

. use "C:\Program Files\Stata17\ado\base\a\auto.dta"
(1978 automobile data)

.  xtile quart = price , nq(4)

. tab quart

4 quantiles |
  of price  |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         19       25.68       25.68
          2 |         18       24.32       50.00
          3 |         19       25.68       75.68
          4 |         18       24.32      100.00
------------+-----------------------------------
      Total |         74      100.00

.

Kind regards,
Carlo
(StataNow 18.5)
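
A minimal sketch of how the quartile variable above could be turned into the two requested dummies, assuming "belongs to the 25th percentile" means the bottom quarter and "belongs to the 75th percentile" means the top quarter. The names low25 and high75 are made up for illustration, and sysuse auto is used in place of the full file path.

```stata
sysuse auto, clear

* quart runs from 1 (lowest quarter of price) to 4 (highest quarter)
xtile quart = price, nq(4)

* 1 if in the bottom quarter, 0 otherwise; missing stays missing
generate byte low25 = (quart == 1) if !missing(quart)

* 1 if in the top quarter, 0 otherwise; missing stays missing
generate byte high75 = (quart == 4) if !missing(quart)

tabulate low25 high75
```

With ties at a cutpoint the groups need not be exactly equal in size, as the 19/18/19/18 split in the table above shows.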

Nick Cox

Join Date: Mar 2014

Posts: 35233
#3

24 Apr 2022, 11:53

I can't quite follow what this means. The 25% and 75% percentiles are points -- summary values calculated according to whatever percentile rule you're following (and there are several) -- and otherwise, by extension, intervals that each correspond to just about 1% of the data.

Do you mean something more like being in

1. the first quarter (less than the 25% percentile or lower (first) quartile)

and being in

2. the last quarter (greater than or equal to the 75% percentile or upper (third) quartile)?

Sally Ahmed

Join Date: Dec 2020

Posts: 59
#4

24 Apr 2022, 12:09

Hi Carlo, can you please clarify the use of xtile?

Hi Nick, yes, that is what I mean. I am following this method of measurement from an article, and the authors also mention that they used dummy variables based on the 90th and 10th percentiles as a robustness check and that the results remain the same. If this is the case, what is the best code to create these dummy variables?

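
For the 10th and 90th percentile variant mentioned above, one possible sketch uses _pctile, which leaves the requested percentiles in r(r1), r(r2), and so on. The auto data, the scalar names cut10 and cut90, and the dummy names low10 and high90 are all assumptions for illustration; whether the article treats values exactly at the cutoff as inside or outside the group is not stated in this thread.

```stata
sysuse auto, clear

* compute the 10th and 90th percentiles of price
_pctile price, percentiles(10 90)
scalar cut10 = r(r1)
scalar cut90 = r(r2)

* 1 if at or below the 10th percentile, 0 otherwise
generate byte low10 = (price <= scalar(cut10)) if !missing(price)

* 1 if at or above the 90th percentile, 0 otherwise
generate byte high90 = (price >= scalar(cut90)) if !missing(price)

tabulate low10 high90
```
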
Nick Cox

Join Date: Mar 2014

Posts: 35233
#5

24 Apr 2022, 12:16

I am not clear that you answered my question.

You added more in mentioning other percentiles.

Possibly what you need is to use summarize, if you are referring to those percentiles across the whole dataset, and then generate variables for being greater or less. But if you have, say, panel data, you will want to use egen.

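
The summarize route for percentiles taken across the whole dataset might look like the following sketch. It follows the definitions in #3 (strictly below the lower quartile, at or above the upper quartile); the auto data and the names q1, q3, firstquarter, and lastquarter are assumptions for illustration.

```stata
sysuse auto, clear

* the detail option stores r(p25) and r(p75), among other percentiles
quietly summarize price, detail
scalar q1 = r(p25)
scalar q3 = r(p75)

* first quarter: below the lower quartile
generate byte firstquarter = (price < scalar(q1)) if !missing(price)

* last quarter: at or above the upper quartile
generate byte lastquarter = (price >= scalar(q3)) if !missing(price)

tabulate firstquarter lastquarter
```

Changing < to <= (or >= to >) shifts observations that sit exactly on a quartile from one group to the other, so the choice matters when there are ties.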
Sally Ahmed

Join Date: Dec 2020

Posts: 59
#6

24 Apr 2022, 12:28

I am using panel data, and yes to your previous question. X is the difference between two variables, and to account for the heterogeneity of firms based on their Y and Z practices, I want to create these dummy variables. So my question is: if I would like to create two variables, one for the firms that have the greatest gap (representing 75%) and one for those that have the lowest gap (25%), how do I do that?
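
For panel data, the egen route suggested in #5 could be sketched as below. This is only an illustration on simulated data: the variables firm, year, y, z, and x, and the dummy names lowgap and highgap, are hypothetical, and computing the quartiles separately within each year is an assumption. The article being followed may instead use pooled or within-industry cutoffs.

```stata
clear
set seed 2022
set obs 200

* toy panel: 40 firms observed for 5 years each
generate firm = ceil(_n / 5)
bysort firm: generate year = 2016 + _n

* x is the gap between two simulated variables
generate y = rnormal()
generate z = rnormal()
generate x = y - z

* lower and upper quartiles of x within each year
bysort year: egen x_p25 = pctile(x), p(25)
bysort year: egen x_p75 = pctile(x), p(75)

* 1 if the gap is in the bottom or top quarter for that year
generate byte lowgap  = (x <= x_p25) if !missing(x)
generate byte highgap = (x >= x_p75) if !missing(x)

tabulate year lowgap
tabulate year highgap
```

Replacing p(25) and p(75) with p(10) and p(90) gives the 10th and 90th percentile version; dropping the bysort year: prefix gives cutoffs pooled over all years.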

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17613

24 Apr 2022, 12:44

Sally:
I'm still not clear about what you're after.
That said, I do hope the following toy-example (based on a cross-sectional dataset, though) can be of some help:

Code:

. use "C:\Program Files\Stata17\ado\base\a\auto.dta"
(1978 automobile data)

. xtile quart = price , nq(4)

. ttest price if quart==1 | quart==3, unequal by( quart )

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
       1 |      19    3907.684    61.31822    267.2799    3778.859    4036.509
       3 |      19    5708.947    100.1486    436.5376    5498.543    5919.352
---------+--------------------------------------------------------------------
Combined |      38    4808.316     158.987    980.0618    4486.177    5130.454
---------+--------------------------------------------------------------------
    diff |           -1801.263    117.4294               -2041.142   -1561.384
------------------------------------------------------------------------------
    diff = mean(1) - mean(3)                                      t = -15.3391
H0: diff = 0                     Satterthwaite's degrees of freedom =  29.8327

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

.

Kind regards,
Carlo
(StataNow 18.5)
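
The example above compares the first and third quarters. If the comparison of interest is the lowest quarter against the highest quarter, as the question seems to suggest, a variant might look like this sketch (same auto data, loaded with sysuse rather than the full path):

```stata
sysuse auto, clear
xtile quart = price, nq(4)

* compare mean price in the bottom quarter (1) with the top quarter (4)
ttest price if inlist(quart, 1, 4), by(quart) unequal
```
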
