  • Percentile and Dummy Variable

    Hello Stata experts,

    As a beginner, I have a basic question. Using a panel data set, I want to run a regression in which two variables, export intensity and import intensity, are classified by percentile. For the top 25% most export- and import-intensive industries, a dummy (binary) variable is set to 1; for the other 75% of industries, which form the base category, it is set to 0. We want to find out how the interaction between this top-25% dummy and the exchange rate affects employment.

    My questions are: first, how can I identify the top 25% of export/import-intensive industries by percentile? Second, how do I create the dummy variable (1 for the top 25%, 0 for the others), and how do I include its interaction with the exchange rate in the regression?

    Thanks for your response.

    list id qdate employment real_wage exp_index imp_index ex_rate, clean noobs abbreviate(12)

    id qdate employment real_wage exp_index imp_index ex_rate
    1 2010q1 107.50307 136.4281 93 78.6 98.26
    1 2010q2 101.47601 95.30006 100.7 85.7 100.52
    1 2010q3 108.11808 99.49396 97.4 116.8 101.84
    1 2010q4 82.902826 73.23257 108.9 119 99.38
    1 2011q1 128.29028 114.1789 114.5 130.3 92.32
    1 2011q2 119.31119 94.01495 123 139.4 90.24
    1 2011q3 103.81303 73.3571 130.1 166.7 82.57
    1 2011q4 125.21525 97.55968 127.6 184.5 85.61
    1 2012q1 137.63837 120.0093 121.8 155.2 89.53
    1 2012q2 126.32226 115.3941 119.7 145.4 91.02
    1 2012q3 120.2952 95.52112 120.3 143.7 90.72
    1 2012q4 134.19434 107.1768 116.9 116.3 91.89
    1 2013q1 127.42927 152.0428 120.86 125.89 93.30
    1 2013q2 130.2583 141.661 122.72 137.52 89.36
    1 2013q3 115.74415 118.2723 130.79 141.68 85.04
    1 2013q4 128.29028 127.1329 135.07 150.32 83.57
    1 2014q1 127.67527 147.8189 146.85 168.23 80.54
    1 2014q2 130.6273 141.4241 135.61 147.25 86.48
    1 2014q3 125.58425 128.6547 143.26 130.9 85.65
    1 2014q4 130.6273 150.1881 144.12 126.17 87.51
    1 2015q1 127.06027 163.0447 139.31 127.82 86.58
    1 2015q2 125.95325 141.5509 160.93 114.49 82.00
    1 2015q3 119.55719 148.1773 158.5 109.22 76.01
    1 2015q4 119.31119 141.6411 136.29 106.77 81.97
    1 2016q1 109.71709 161.6669 137.22 98.8 84.83
    1 2016q2 115.62115 160.5049 139.23 103.97 84.22
    1 2016q3 106.02706 132.3851 154.06 100.78 83.93
    1 2016q4 102.82902 129.4111 204.77 116.96 77.36
    1 2017q1 132.34932 155.5117 245.9 176.75 75.12
    1 2017q2 134.07134 168.5876 221.03 193.08 76.79
    1 2017q3 127.67527 146.2061 211.08 154.48 75.78
    1 2017q4 117.46617 138.498 225.75 187.91 71.36
    2 2010q1 98.556701 96.35319 98.2 95.8 98.26
    2 2010q2 92.920962 90.09997 99.5 99.8 100.52
    2 2010q3 103.64261 106.5984 99.4 100.3 101.84
    2 2010q4 104.87973 106.6492 102.9 104.1 99.38
    2 2011q1 101.4433 99.17506 116.5 120.3 92.32
    2 2011q2 112.16495 114.3951 121.7 126.5 90.24
    2 2011q3 106.2543 103.714 133.5 142.9 82.57
    2 2011q4 112.85223 110.0785 141.5 144.2 85.61
    2 2012q1 113.53952 107.8803 137.1 136.6 89.53
    2 2012q2 119.31271 111.8697 136.6 142.1 91.02
    2 2012q3 123.98625 120.7126 132.6 139.6 90.72
    2 2012q4 123.43643 119.547 129 143.2 91.89
    2 2013q1 128.24742 139.153 129.99 143.61 93.30
    2 2013q2 130.17182 132.2491 133.89 148.48 89.36
    2 2013q3 126.04811 119.4897 144.1 155.06 85.04
    2 2013q4 120.9622 112.6071 151.08 159.71 83.57
    2 2014q1 135.80756 139.3555 165.5 176.76 80.54
    2 2014q2 127.97251 117.1748 160.49 170.58 86.48
    2 2014q3 132.78351 116.7656 169.59 170.9 85.65
    2 2014q4 136.6323 123.1726 189.23 171.56 87.51
    2 2015q1 132.78351 123.1513 201.48 178.76 86.58
    2 2015q2 140.75601 134.5694 215.16 194.23 82.00
    2 2015q3 137.31959 124.2575 221.29 202.24 76.01
    2 2015q4 138.83162 135.3456 212.15 203.57 81.97
    2 2016q1 134.98282 144.5872 209.46 199.2 84.83
    2 2016q2 140.89347 156.6378 201.55 199.98 84.22
    2 2016q3 137.73196 145.2637 203.42 207.11 83.93
    2 2016q4 138.5567 149.6884 221.05 225.42 77.36
    2 2017q1 135.80756 145.7325 240.13 251.69 75.12
    2 2017q2 143.78007 156.4474 232.14 242.87 76.79
    2 2017q3 139.93127 149.7617 229.09 238.34 75.78
    2 2017q4 145.2921 158.9064 240.35 258.19 71.36
    3 2010q1 99.667772 94.51541 99.7 103.6 98.26
    3 2010q2 101.47991 113.5809 101.7 106.7 100.52
    3 2010q3 105.22501 101.2129 100.2 95.3 101.84
    3 2010q4 93.627301 91.00129 98.4 94.4 99.38
    3 2011q1 106.31229 121.101 106.8 101.2 92.32
    3 2011q2 106.31229 100.3025 110 115.5 90.24
    3 2011q3 101.60072 101.8992 118.9 126.9 82.57
    3 2011q4 113.56086 99.08925 126 140.2 85.61
    3 2012q1 110.66143 135.6254 125.5 120.7 89.53
    3 2012q2 104.13772 102.1717 128.8 120.2 91.02
    3 2012q3 116.7019 106.4782 132.9 124.7 90.72
    3 2012q4 111.02386 109.3083 133.7 140.9 91.89
    3 2013q1 111.6279 110.28 132.67 148.96 93.30
    3 2013q2 126.60827 121.8933 132.45 147 89.36
    3 2013q3 120.68861 119.9877 145.71 152.33 85.04
    3 2013q4 116.33947 136.3171 153.57 143.05 83.57
    3 2014q1 122.74237 133.5822 160.16 174.87 80.54
    3 2014q2 116.58109 123.0074 156.46 173.4 86.48
    3 2014q3 114.1649 125.8661 163.08 176.49 85.65
    3 2014q4 114.52733 147.3899 167.67 169.36 87.51
    3 2015q1 127.21232 144.7127 173.48 170.63 86.58
    3 2015q2 114.28571 134.0307 182.49 180.89 82.00
    3 2015q3 121.7759 136.1627 187.13 188.62 76.01
    3 2015q4 119.96375 148.9171 191.41 189.74 81.97
    3 2016q1 125.40018 147.2465 192.79 175.47 84.83
    3 2016q2 122.50075 174.4344 193.86 170.31 84.22
    3 2016q3 119.84295 143.2341 195.57 179.05 83.93
    3 2016q4 130.8366 161.4556 217.58 195.77 77.36
    3 2017q1 135.54817 166.4919 241.21 208.05 75.12
    3 2017q2 127.93718 162.9369 231.26 188.45 76.79
    3 2017q3 128.29961 154.1123 224.44 181.49 75.78
    3 2017q4 121.7759 143.4018 244.2 203.22 71.36
    4 2010q1 145.22508 151.3518 99.7 101.8 98.26
    4 2010q2 87.224796 87.11636 102 100.7 100.52
    4 2010q3 84.027489 81.43513 100 99 101.84
    4 2010q4 83.52265 79.78746 98.3 98.5 99.38
    4 2011q1 80.99846 82.98003 106.5 107.7 92.32
    4 2011q2 80.717995 81.2909 106.8 112 90.24
    4 2011q3 81.839857 95.0141 120.9 127.4 82.57
    4 2011q4 68.433602 77.33405 133.7 130.4 85.61
    4 2012q1 53.288461 61.24731 127.5 126.7 89.53
    4 2012q2 53.232368 77.04062 127.9 125.6 91.02
    4 2012q3 52.447064 60.91924 139.3 124.6 90.72
    4 2012q4 54.690789 65.99006 122.2 125.4 91.89
    4 2013q1 52.390971 63.3693 128.35 126.78 93.30
    4 2013q2 51.998319 69.38093 130.66 133.9 89.36
    4 2013q3 53.007995 63.99219 138.27 145.05 85.04
    4 2013q4 50.876457 60.1172 139.26 151.21 83.57
    4 2014q1 50.708177 79.29106 150.23 162.1 80.54
    4 2014q2 53.45674 62.58109 149.69 161.61 86.48
    4 2014q3 54.129857 61.79483 153.7 162.05 85.65
    4 2014q4 49.922874 71.61903 158.42 164.98 87.51
    4 2015q1 56.429675 77.67636 161.86 177.39 86.58
    4 2015q2 53.737206 80.50221 175.89 189.72 82.00
    4 2015q3 57.102793 68.60752 182.35 197.76 76.01
    4 2015q4 57.158886 67.91119 182.53 199.11 81.97
    4 2016q1 59.963542 70.41762 199.68 207.79 84.83
    --more--

  • #2
    I don't get from this whether you want to calculate percentiles separately for each date, but either way

    Code:
    egen p75 = pctile(exp_index), p(75)
    gen wanted = exp_index > p75 if !missing(exp_index)
    is a start on what you ask for. Personally, I dislike drastic dichotomies, dreading dire downgrading of desirable data. Why not just use export index, or a transform of it, as a predictor?
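
    If the percentiles are meant to be computed separately for each date, one possible sketch follows. It assumes the variable names shown in the listing in #1 and that the data are already in memory; the names p75_q and wanted_q are made up for illustration.

    ```stata
    * 75th percentile of exp_index within each quarter (p75_q is a hypothetical name)
    bysort qdate: egen p75_q = pctile(exp_index), p(75)
    * 1 if above that quarter's 75th percentile, 0 otherwise; missing stays missing
    gen byte wanted_q = exp_index > p75_q if !missing(exp_index)
    ```

    The if !missing() qualifier matters because Stata treats missing values as larger than any number, so without it an observation with a missing exp_index would be coded 1.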


    • #3
      Code:
      foreach v of varlist *_index {
          centile `v', centile(75)
          gen byte top_25_pct_`v' = `v' > `r(c_1)' if !missing(`v')
      }
      As these variables are dichotomous, you use them in regressions by prefixing them with i. (factor-variable notation). For example:
      Code:
      regress real_wage ex_rate i.top_25_pct_imp_index i.top_25_pct_exp_index
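
      The question in #1 also asks about an interaction with the exchange rate, with employment as the outcome. A minimal sketch, assuming the top_25_pct_* dummies have been created as in the loop in this post, that qdate is a numeric quarterly date variable, and that a fixed-effects panel model is appropriate (the thread does not settle the choice of model):

      ```stata
      * Declare the panel structure: id = industry, qdate = quarter
      xtset id qdate
      * ## includes the main effects and the interactions; c. marks ex_rate as continuous
      xtreg employment c.ex_rate##(i.top_25_pct_exp_index i.top_25_pct_imp_index) real_wage, fe
      ```

      The coefficients on the interaction terms estimate how much the association between the exchange rate and employment differs for observations flagged as top 25%.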
      In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
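
      For instance, a call along these lines would produce a copy-and-paste example of the variables listed in #1 (the range 1/20 is an arbitrary choice for illustration):

      ```stata
      * Write out the first 20 observations of the listed variables in dataex format
      dataex id qdate employment real_wage exp_index imp_index ex_rate in 1/20
      ```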

      Added: Crossed with #2, which offers another way of doing it. I should add that I strongly agree with Nick's concerns about the wisdom of this approach in the first place. Converting continuous variables into dichotomies degrades reliability and throws away information, all the more so when the cutpoint for the dichotomy is farther from the mean. It's usually better to use the original continuous variable, or a continuous transform thereof.
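
      A sketch of the continuous alternative, assuming the variable names from #1, a numeric quarterly qdate, and a fixed-effects model as just one candidate specification. The at() values are illustrative only and were chosen to lie within the range of exp_index in the listing.

      ```stata
      xtset id qdate
      * Interact the exchange rate with the continuous indexes instead of dummies
      xtreg employment c.ex_rate##(c.exp_index c.imp_index) real_wage, fe
      * Marginal effect of ex_rate at selected values of exp_index
      margins, dydx(ex_rate) at(exp_index = (100 150 200))
      ```

      This keeps all the information in the indexes and lets the estimated exchange-rate effect vary smoothly with export intensity rather than jumping at a single cutpoint.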
      Last edited by Clyde Schechter; 02 Mar 2020, 12:06.


      • #4
        Thanks, Clyde Schechter. The code works perfectly.



