  • Executing commands even if global variable is empty

    Hello,


    I am creating flag variables to detect irregular changes in capital and investment variables in a panel dataset.
    The main idea is to flag capital and investment variables based on (i) their own year-on-year changes (a sudden increase or decrease) and (ii) the comovement between capital and investment.
    The underlying logic is that capital stocks (tangible and intangible) evolve according to past investment (tangible and intangible, respectively).
    For example, if capital stock rises sharply (above a certain threshold) at t, but investment rose at t-1, we do not flag it.
    In other words, any increase or decrease in capital stock at t should be justified by an increase or decrease in investment at t-1.

    An example of the dataset:
    id  year  capital_tan  capital_intan  investment_tan  investment_intan
     1  2020            0              0               0                90
     1  2021            0            200               0                 0
     1  2022            0            299               0                30
     1  2023          200          30000               0               600
     1  2024        50000              .             300                 0
     1  2025            0              0            5000                 .
     2  2011           40           4555               .                 .
     2  2012            .           4555               .                 .
     2  2013            .          46666            3333            555555
     2  2014          600              0           55555            555555
     3  2009           34           1345              22                53
     3  2010        10000           1355            3523              5235
     3  2011            .           1555               .                 .



    I therefore used the code below:

    * Setup
    tsset $id $yr, yearly
    bysort $id ($yr): gen byte first_year = (_n == 1)


    * Generate log differences (excluding first year)
    foreach var in $vars_L $vars_K $vars_I $vars_output $vars_ratio {
    gen log_diff_`var' = .
    replace log_diff_`var' = log(`var'/L1.`var') ///
    if L1.`var' > 0 & `var' > 0 ///
    & first_year == 0
    }

    * Section (i) - define global var
    gl Ktan capital_tan
    gl Kintan capital_intan
    gl Itan invest_tan
    gl Iintan invest_intan
    gl vars_K capital_tan capital_intan
    gl vars_I invest_tan invest_intan

    * Section (ii) - Define threshold
    loc threshold_K 0.69
    loc threshold_I 0.69
    loc com_threshold_K 0.3

    * Section (iii) - Flags for K and I variables
    local nvars : word count $vars_K

    forvalues i = 1/`nvars' {

    loc K_var : word `i' of $vars_K
    loc I_var : word `i' of $vars_I

    gen log_diff_L1_`I_var' = .
    replace log_diff_L1_`I_var' = log(L1.`I_var'/L2.`I_var') ///
    if L1.`I_var' > 0 & L2.`I_var' > 0

    gen flag_jump_`K_var' = 0
    gen flag_jump_`I_var' = 0

    // Startup jumps
    replace flag_jump_`K_var' = 1 ///
    if first_year == 0 ///
    & `K_var' > 0 ///
    & (L1.`K_var' == 0 | L1.`K_var' == .) ///
    & (L2.`K_var' == 0 | L2.`K_var' == .)
    replace flag_jump_`I_var' = 1 ///
    if first_year == 0 ///
    & `I_var' > 0 ///
    & (L1.`I_var' == 0 | L1.`I_var' == .) ///
    & (L2.`I_var' == 0 | L2.`I_var' == .)

    // Drops to zero
    replace flag_jump_`K_var' = 1 ///
    if first_year == 0 ///
    & `K_var' == 0 ///
    & L1.`K_var' > 0
    replace flag_jump_`I_var' = 1 ///
    if first_year == 0 ///
    & `I_var' == 0 ///
    & L1.`I_var' > 0

    // Positive-to-positive jumps with lagged comovement
    replace flag_jump_`K_var' = 1 ///
    if first_year == 0 ///
    & log_diff_`K_var' != . ///
    & abs(log_diff_`K_var') > `threshold_K' ///
    & (log_diff_L1_`I_var' == . ///
    | abs(log_diff_`K_var' - log_diff_L1_`I_var') > `com_threshold_K')

    replace flag_jump_`I_var' = 1 ///
    if first_year == 0 ///
    & log_diff_`I_var' != . ///
    & abs(log_diff_`I_var') > `threshold_I'
    }

    foreach var of varlist flag_jump_* {
    replace `var' = 0 if missing(`var')
    }



    But, now I want to make commands using if statement to account for the cases where there is no either capital or investment variables in the dataset.
    If both the capital and investment variables of a pair exist (intangible capital and intangible investment, or tangible capital and tangible investment), the code should generate flags using comovement.
    However, if either the K or the I variable is absent from the dataset (not missing observations: the variable itself is unavailable), which we signal by leaving the corresponding global macro empty in section (i), the code should generate flags based only on each available variable's own year-on-year changes above the threshold.

    Could someone please help me with this?
    I tried to do it, but every time I ran into errors :/

    Thank you in advance!

    AC









    Last edited by Anne-Claire Jo; 25 Mar 2025, 11:21.

  • #2
    This is really hard for me to follow. An easy thing to do and a great help to readers is to use CODE delimiters to make the code more readable.

    Then when I tried to edit your data example to convert it to code, I discovered that it's an image, so that's no go until and unless I type in the data myself, which (sorry) I won't do.

    Still, one comment is possible. You're referring, if I understand this correctly, to global macros that have not yet been defined. That's not fatal in itself, but I can't see that the code would get past

    Code:
    tsset $id $yr, yearly
    as it would evaluate to

    Code:
    tsset, yearly
    which is illegal.

    Turn and turn about, if that is a misinterpretation of your code, please give us a self-contained reproducible example with sample data, code that does run some way, and a clear question.

    https://www.statalist.org/forums/help#stata should help.
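
    To make the point about undefined macros concrete, here is a minimal sketch. It relies only on official Stata commands; the guard loop and its error message are illustrative assumptions, not part of the code in #1.

    Code:
    * An undefined global macro expands to nothing, so if $id and $yr are
    * undefined this displays the command without any variable names
    display "tsset $id $yr, yearly"

    * Illustrative guard: stop the do-file early if a required global macro is empty
    foreach m in id yr {
        if "${`m'}" == "" {
            display as error "global macro `m' is not defined"
            exit 198
        }
    }
    tsset $id $yr, yearly

    Run with both macros undefined, the guard exits with return code 198 before -tsset- is reached, which makes the cause easier to see than the syntax error from -tsset, yearly-.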



    • #3
      Well, you already have a local macro nvars which tells you whether you have any K-vars or not. You need to make a similar local macro for the I-vars. And actually, it would be best to use names that reflect what they are. So instead of nvars, I would create:
      Code:
      local n_K_vars: word count $vars_K
      local n_I_vars: word count $vars_I
      This also entails changing -forvalues i = 1/`nvars'- to -forvalues i = 1/`n_K_vars'-.

      At the point where you want your code to branch based on whether or not the K or I vars do not exist:
      Code:
      if inlist(0, `n_K_vars', `n_I_vars') {
          // HERE PLACE ALL CODE THAT SHOULD RUN WHEN EITHER K OR I VARS DO NOT EXIST
      }
      else {
          // HERE PLACE ALL CODE THAT SHOULD RUN WHEN BOTH K AND I VARS EXIST
      }
      Notice, crucially, that this is done with -if- commands that guard an entire block of code wrapped in curly braces, not with -if- qualifiers attached to each command. This is an important distinction in Stata. -if- qualifiers attached to individual commands determine which subset of the data the command will apply to. -if- commands evaluate a logical expression and, depending on whether it is true or false, the entire block of commands is either executed or is skipped. See Cox NJ, Schechter CB. (2023) Stata tip 152: if and if: When to use the if qualifier and when to use the if command. Stata Journal 23(2):589-594. https://www.stata-journal.com/articl...article=st0721 for a fuller explanation.
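
      As a minimal sketch of that distinction, using the auto dataset shipped with Stata (the display messages are illustrative, and vars_I is assumed to be the global macro from #1):

      Code:
      sysuse auto, clear

      * if qualifier: the command runs, restricted to the observations where the condition holds
      count if foreign == 1

      * if command: the expression is evaluated once, and the whole block runs or is skipped
      local n_I_vars: word count $vars_I
      if `n_I_vars' == 0 {
          display "no investment variables named in vars_I: skip the comovement code"
      }
      else {
          display "`n_I_vars' investment variable(s) named in vars_I"
      }

      If vars_I is empty or undefined, the word count is 0 and only the first branch runs.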

      Added: Crossed with #2 which deals with other issues in the code. To be honest, I didn't really attempt to understand the lengthy code that was given. Instead, I simply responded generically to the opening statement: "But, now I want to make commands using if statement to account for the cases where there is no either capital or investment variables in the dataset." It is possible I have missed the point.
      Last edited by Clyde Schechter; 25 Mar 2025, 11:59.



      • #4
        Sorry for making it complicated!

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input long ID int year long(capital_tan capital_intan invest_tan invest_intan)
        1 2011   17045     0    1566     0
        1 2012   62261     0       0     0
        1 2013  607367     0       0     0
        1 2014 1158161     0  550794     0
        1 2015 1059901     0       0     0
        1 2016 1665408     0  500207     0
        1 2017 2726397     0 1614000     0
        1 2018  185213     0   16761     0
        1 2019 4671962     0 2105818     0
        1 2020  328650     0   62791     0
        1 2021       0   153       0     0
        2 2002       0     0       0     0
        2 2003       0     0       0     0
        3 2010       0     0       0     0
        3 2011     629     0     629     0
        3 2012     441     0       0     0
        4 2012       0     0       0     0
        4 2013       0     0       0     0
        4 2014       0     0       0     0
        4 2015     794    25      83    28
        4 2016     180     0       0     0
        5 2020       0     0       0     0
        6 2014       0     0       0     0
        6 2015       0     0       0     0
        6 2016       0     0       0     0
        6 2017       0     0       0     0
        6 2018       0     0       0     0
        7 2005    9949  2274    9950  3956
        7 2006   20227  2910   13262  2011
        7 2007   16116  2462       0   433
        7 2008   36979  1508   27667     0
        7 2009   30855  1300    1852   695
        end
        Code:
        * Setup
        tsset $id $yr, yearly
        bysort $id ($yr): gen byte first_year = (_n == 1)
        
        
        * Generate log differences (excluding first year)
        foreach var in $vars_L $vars_K $vars_I $vars_output $vars_ratio {
        gen log_diff_`var' = .
        replace log_diff_`var' = log(`var'/L1.`var') ///
        if L1.`var' > 0 & `var' > 0 ///
        & first_year == 0
        }
        
        * Section (i) - define global var
        gl Ktan capital_tan
        gl Kintan capital_intan
        gl Itan invest_tan
        gl Iintan invest_intan
        gl vars_K capital_tan capital_intan
        gl vars_I invest_tan invest_intan
        
        * Section (ii) - Define threshold
        loc threshold_K 0.69
        loc threshold_I 0.69
        loc com_threshold_K 0.3
        
        * Section (iii) - Flags for K and I variables
        local nvars : word count $vars_K
        
        forvalues i = 1/`nvars' {
        
        loc K_var : word `i' of $vars_K
        loc I_var : word `i' of $vars_I
        
        gen log_diff_L1_`I_var' = .
        replace log_diff_L1_`I_var' = log(L1.`I_var'/L2.`I_var') ///
        if L1.`I_var' > 0 & L2.`I_var' > 0
        
        gen flag_jump_`K_var' = 0
        gen flag_jump_`I_var' = 0
        
        // Startup jumps
        replace flag_jump_`K_var' = 1 ///
        if first_year == 0 ///
        & `K_var' > 0 ///
        & (L1.`K_var' == 0 | L1.`K_var' == .) ///
        & (L2.`K_var' == 0 | L2.`K_var' == .)
        replace flag_jump_`I_var' = 1 ///
        if first_year == 0 ///
        & `I_var' > 0 ///
        & (L1.`I_var' == 0 | L1.`I_var' == .) ///
        & (L2.`I_var' == 0 | L2.`I_var' == .)
        
        // Drops to zero
        replace flag_jump_`K_var' = 1 ///
        if first_year == 0 ///
        & `K_var' == 0 ///
        & L1.`K_var' > 0
        replace flag_jump_`I_var' = 1 ///
        if first_year == 0 ///
        & `I_var' == 0 ///
        & L1.`I_var' > 0
        
        // Positive-to-positive jumps with lagged comovement
        replace flag_jump_`K_var' = 1 ///
        if first_year == 0 ///
        & log_diff_`K_var' != . ///
        & abs(log_diff_`K_var') > `threshold_K' ///
        & (log_diff_L1_`I_var' == . ///
        | abs(log_diff_`K_var' - log_diff_L1_`I_var') > `com_threshold_K')
        
        replace flag_jump_`I_var' = 1 ///
        if first_year == 0 ///
        & log_diff_`I_var' != . ///
        & abs(log_diff_`I_var') > `threshold_I'
        }
        
        foreach var of varlist flag_jump_* {
        replace `var' = 0 if missing(`var')
        }
        And to explain further, suppose I put:
        gl Ktan
        gl Kintan capital_intan
        gl Itan invest_tan
        gl Iintan invest_intan

        where Ktan is empty. The code should then automatically flag the tangible variable that is available (here Itan) based only on its own y-o-y increase/decrease.
        For Kintan, however, we have both Kintan and Iintan, so the code should run as above (flagging with comovement).

        The dataset I posted here is just an example; since this code is meant to work with any dataset, I have to anticipate variables being absent from the dataset as well.
        I hope that this explanation is clearer.

        Thanks again!



        • #5
          Sorry, but #4 falls at the same hurdle as did #1.

          The globals $id $yr are undefined when used in the code shown. Perhaps in your real code there is some earlier definition, but that is no help to me.

          Someone else may be able to follow what you are seeking here.

          The only slightly overlapping reply from Clyde Schechter picks up another issue, which I could sense to be looming (cf. Clyde's reference to a joint paper of ours in #3), but my approach was to stop at the first code chunk I couldn't understand.

          What is going on here? The large chunks of code that look confused suggest various hypotheses, which are not contradictory.

          1. You know quite a lot of Stata but you're coding at speed and then asking for help.

          2. This is, or was, someone else's code and you're trying to hack at it.

          3. Some copilot is being used to suggest code and you're now trying to get it to work.

          None of these is out of order, naturally, but the thread so far is fairly frustrating.



          • #6
            Sorry again, Nick Cox.
            It's true that I defined the global macros earlier in the code:
            Code:
            gl id                ID
            gl yr                year
            gl ind                ind_2digit
            gl Ktan                capital_tan
            gl Kintan            capital_intan
            gl Itan                invest_tan
            gl Iintan            invest_intan
            gl vars_K             capital_tan capital_intan
            gl vars_I             invest_tan invest_intan
            The data example I showed is just one example of the several datasets I will be dealing with.
            The other datasets will contain different variables, or the same variables under different names (e.g. intangible K could be named K_in, K_inta, or capital_in).
            That is why the commands need to be general enough to work with other firm-level datasets.
            I hope this clarifies the point you mentioned.



            And Clyde Schechter, thanks for your reply!
            Actually, I am still struggling to understand how it works.

            Maybe it helps if I show the relevant earlier code:

            Code:
            * Variables
            gl id                ID
            gl yr                year
            gl ind                ind_2digit
            gl Ktan                capital_tan
            gl Kintan            capital_intan
            gl Itan                invest_tan
            gl Iintan            invest_intan
            gl vars_K             capital_tan capital_intan
            gl vars_I             invest_tan invest_intan
            
            * Setting thresholds to detect jumps - log values
            loc threshold_K             0.69     
            loc threshold_I             0.69          
            loc com_threshold_K         0.3
            
            * Setup
            tsset $id $yr, yearly
            bysort $id ($yr): gen byte first_year = (_n == 1)
            
            
            * Generate log differences (excluding first year)
            foreach var in $vars_L $vars_K $vars_I $vars_output $vars_ratio {
                gen log_diff_`var' = .
                replace log_diff_`var' = log(`var'/L1.`var') ///
                    if L1.`var' > 0 & `var' > 0 ///
                    & first_year == 0
            }
            
            * -COMOVEMENT- Capital and investment flags
            local nvars : word count $vars_K
            
            forvalues i = 1/`nvars' {
                
                loc K_var : word `i' of $vars_K
                loc I_var : word `i' of $vars_I
            
                gen log_diff_L1_`I_var' = .
                replace log_diff_L1_`I_var' = log(L1.`I_var'/L2.`I_var') ///
                    if L1.`I_var' > 0 & L2.`I_var' > 0
            
                gen flag_jump_`K_var' = 0
                gen flag_jump_`I_var' = 0
            
                // Startup jumps
                replace flag_jump_`K_var' = 1 ///
                    if first_year == 0 ///
                    & `K_var' > 0 ///
                    & (L1.`K_var' == 0 | L1.`K_var' == .) ///
                    & (L2.`K_var' == 0 | L2.`K_var' == .)
                replace flag_jump_`I_var' = 1 ///
                    if first_year == 0 ///
                    & `I_var' > 0 ///
                    & (L1.`I_var' == 0 | L1.`I_var' == .) ///
                    & (L2.`I_var' == 0 | L2.`I_var' == .)
            
                // Drops to zero
                replace flag_jump_`K_var' = 1 ///
                    if first_year == 0 ///
                    & `K_var' == 0 ///
                    & L1.`K_var' > 0
                replace flag_jump_`I_var' = 1 ///
                    if first_year == 0 ///
                    & `I_var' == 0 ///
                    & L1.`I_var' > 0
            
                // Positive-to-positive jumps with lagged comovement
                replace flag_jump_`K_var' = 1 ///
                    if first_year == 0 ///
                    & log_diff_`K_var' != . ///
                    & abs(log_diff_`K_var') > `threshold_K' ///
                    & (log_diff_L1_`I_var' == . ///
                    | abs(log_diff_`K_var' - log_diff_L1_`I_var') > `com_threshold_K')
            
                replace flag_jump_`I_var' = 1 ///
                    if first_year == 0 ///
                    & log_diff_`I_var' != . ///
                    & abs(log_diff_`I_var') > `threshold_I'
            }
            
            foreach var of varlist flag_jump_* {
                replace `var' = 0 if missing(`var')
            }
            To clarify further, I want to use

            Code:
            * -COMOVEMENT- Capital and investment flags
            local nvars : word count $vars_K
            
            forvalues i = 1/`nvars' {
                
                loc K_var : word `i' of $vars_K
                loc I_var : word `i' of $vars_I
            
                gen log_diff_L1_`I_var' = .
                replace log_diff_L1_`I_var' = log(L1.`I_var'/L2.`I_var') ///
                    if L1.`I_var' > 0 & L2.`I_var' > 0
            
                gen flag_jump_`K_var' = 0
                gen flag_jump_`I_var' = 0
            
                // Startup jumps
                replace flag_jump_`K_var' = 1 ///
                    if first_year == 0 ///
                    & `K_var' > 0 ///
                    & (L1.`K_var' == 0 | L1.`K_var' == .) ///
                    & (L2.`K_var' == 0 | L2.`K_var' == .)
                replace flag_jump_`I_var' = 1 ///
                    if first_year == 0 ///
                    & `I_var' > 0 ///
                    & (L1.`I_var' == 0 | L1.`I_var' == .) ///
                    & (L2.`I_var' == 0 | L2.`I_var' == .)
            
                // Drops to zero
                replace flag_jump_`K_var' = 1 ///
                    if first_year == 0 ///
                    & `K_var' == 0 ///
                    & L1.`K_var' > 0
                replace flag_jump_`I_var' = 1 ///
                    if first_year == 0 ///
                    & `I_var' == 0 ///
                    & L1.`I_var' > 0
            
                // Positive-to-positive jumps with lagged comovement
                replace flag_jump_`K_var' = 1 ///
                    if first_year == 0 ///
                    & log_diff_`K_var' != . ///
                    & abs(log_diff_`K_var') > `threshold_K' ///
                    & (log_diff_L1_`I_var' == . ///
                    | abs(log_diff_`K_var' - log_diff_L1_`I_var') > `com_threshold_K')
            
                replace flag_jump_`I_var' = 1 ///
                    if first_year == 0 ///
                    & log_diff_`I_var' != . ///
                    & abs(log_diff_`I_var') > `threshold_I'
            }
            
            foreach var of varlist flag_jump_* {
                replace `var' = 0 if missing(`var')
            }
            for the -else- case, that is, when both variables of a capital and investment pair (tangible or intangible) are present in the dataset. I may reuse this code in the future on data that lack some of these variables.



            • #7
              I'm not clear exactly what code you want to run if the capital and/or investment variables are unavailable. Can you show us that?



              • #8
                Hemanshu Kumar Thanks a lot for your reply and sorry for not being clear.

                To explain the context: I would like to write code that flags irregular jumps in the data.
                The code has to allow for the fact that some datasets (other than the one I am currently using as an example) do not contain all the variables the commands need. That is why I use many global/local macros, so that the code can be adjusted later.
                What I am trying to do now is flag jumps in the capital variables (tangible and intangible) and the investment variables (tangible and intangible).
                Given that capital stock depends on lagged investment (tangible capital on tangible investment, intangible capital on intangible investment), I would like the flags to take the comovement between the two into account.
                In other words, if tangible capital rises or falls sharply at time t (relative to t-1), but tangible investment rose or fell correspondingly at t-1, the observation should not be flagged, because the jump is reasonable.
                However, some datasets I will use in the future will not contain all the variables needed to run these commands. For instance, I may have the tangible capital variable but no tangible investment variable. To run the commands in full, I would normally need the pairs (K tangible, I tangible) and (K intangible, I intangible). So when one of the two variables in a pair (or sometimes both) is absent, the commands should not use comovement but only the variable's own jumps above the threshold. For example, if K tangible doesn't exist and I tangible exists, we generate the flag for I tangible from its own jump from t-1 to t. The same logic applies to the other K and I variables.
                In addition, looking only at the jump from t-1 to t may be too short a window, so I should flag a variable at t by looking at both t-1 and t+1. For example, if K tangible is 1000 at t-1, 500000 at t, and 1200 at t+1, it should be flagged because there was a jump at time t (even though comparing only t-1 and t+1 shows no jump).
                I completely understand that this logic is very complicated, which is why I am having trouble implementing it in Stata.
                Could someone help me with this, please?
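
                One possible reading of the t-1 / t / t+1 rule is sketched below. This is an interpretation, not a definitive implementation: the variable names K_tan, chg_in, chg_out and flag_spike are hypothetical, and the rule assumed here is that t is flagged when the log change from t-1 to t and the log change from t+1 back to t both exceed the threshold in the same direction.

                Code:
                clear
                input byte id int year double K_tan
                1 2020   1000
                1 2021 500000
                1 2022   1200
                end
                tsset id year, yearly
                local threshold 0.69

                * log change into t (from t-1) and log change of t relative to t+1
                gen double chg_in  = log(K_tan / L.K_tan) if K_tan > 0 & L.K_tan > 0
                gen double chg_out = log(K_tan / F.K_tan) if K_tan > 0 & F.K_tan > 0

                * flag t only if both changes are defined, exceed the threshold, and have the same sign
                gen byte flag_spike = !missing(chg_in, chg_out) ///
                    & abs(chg_in) > `threshold' & abs(chg_out) > `threshold' ///
                    & sign(chg_in) == sign(chg_out)
                list, noobs

                With these three observations only 2021 is flagged: log(500000/1000) is about 6.2 and log(500000/1200) is about 6.0, both above 0.69, whereas 2020 and 2022 each lack one neighbour.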





                • #9
                  Could someone please help me with this problem?



                  • #10
                    I think you'll need to put more structure on what all types of missingness of variables you might expect. There are four variables, in $Ktan, $Kintan, $Itan, and $Iintan. What combinations of these (2x2x2x2 = 16 possibilities) might be missing? For each of those combinations of missingness, what exactly would you like to do?



                    • #11
                      Hemanshu Kumar so there are 2 pairs of K&I to compute flag variable with comovement - (Ktan, Itan) and (Kintan, Iintan).
                      If both of them in the pair are non-missing, then we can generate a flag variable that accounts for comovement (as in the code, but now it will be [t-1; t+1] for flag in t).
                      However, if one of them (K or I, in either tangible or intangible) is missing, then I want to generate flag variable that only accounts for jumps with irregular patterns in t while looking at [t-1; t+1].
                      For example, if K intan is missing and I intan is non-missing, then the code should generate only the flag jump for I intan with something like (I'm not sure if the below code makes sense though..):
                      Code:
                      gen d_`var' = abs(ln(F.`var') - ln(L.`var'))
                      gen flag_jump_`var' = d_`var' > `threshold_output'
                      But if K intan and I intan are both non-missing, then the code should generate flag variable with comovement (as example in #6, but it should be fixed as well..)
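
                      One thing to watch in a comparison like -d_`var' > `threshold_output'-: Stata treats a missing value as larger than any number, so the expression is true whenever d is missing (first or last year of a panel, or a zero inside ln()). A minimal sketch of the difference, with hypothetical names x, d, flag_naive and flag_safe:

                      Code:
                      clear
                      input byte id int year double x
                      1 2020 100
                      1 2021 110
                      1 2022 120
                      end
                      tsset id year, yearly
                      local threshold 0.69

                      * d is missing in 2020 (no t-1) and in 2022 (no t+1)
                      gen double d = abs(ln(F.x) - ln(L.x))

                      * naive comparison: missing d counts as above the threshold
                      gen byte flag_naive = d > `threshold'

                      * safer comparison: require d to be nonmissing
                      gen byte flag_safe = d > `threshold' & !missing(d)
                      list, noobs

                      Here flag_naive is 1 in 2020 and 2022 even though nothing jumped (d in 2021 is about 0.18), while flag_safe is 0 throughout.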



                      • #12
                        Hi Anne-Claire Jo

                        Here is an idea of how to structure your code. I have used components of your code in #6 and #11; you should change those as needed.

                        Code:
                        * Variables
                        gl id                ID
                        gl yr                year
                        gl ind                ind_2digit
                        gl Ktan                capital_tan
                        gl Kintan            capital_intan
                        gl Itan                invest_tan
                        gl Iintan            invest_intan
                        
                        * Setting thresholds to detect jumps - log values
                        loc threshold_K             0.69    
                        loc threshold_I             0.69          
                        loc com_threshold_K         0.3
                        
                        * Setup
                        tsset $id $yr, yearly
                        bysort $id ($yr): gen byte first_year = (_n == 1)
                        
                        
                        
                        * this section should have code to be run on any of the K, I, output and ratio variables
                        
                        * Generate log differences (excluding first year)
                        foreach var of varlist $Ktan $Kintan $Itan $Iintan $vars_output $vars_ratio {
                            gen log_diff_`var' = .
                            replace log_diff_`var' = log(`var'/L1.`var') ///
                                if L1.`var' > 0 & `var' > 0 ///
                                & first_year == 0
                        }
                        
                        
                        * this section should have code to be run on any of the K/I variables
                        
                        foreach var of varlist $Ktan $Kintan $Itan $Iintan {
                                
                            gen flag_jump_`var' = 0
                            
                            // Startup jumps
                            replace flag_jump_`var' = 1 ///
                                if first_year == 0 ///
                                & `var' > 0 ///
                                & (L1.`var' == 0 | L1.`var' == .) ///
                                & (L2.`var' == 0 | L2.`var' == .)
                                
                            // Drops to zero
                            replace flag_jump_`var' = 1 ///
                                if first_year == 0 ///
                                & `var' == 0 ///
                                & L1.`var' > 0    
                        }
                        
                        * this section should have code specific to the I variables
                        
                        if "$Itan$Iintan" != "" {
                            
                            foreach I_var of varlist $Itan $Iintan {
                                
                                gen log_diff_L1_`I_var' = .
                                replace log_diff_L1_`I_var' = log(L1.`I_var'/L2.`I_var') ///
                                    if L1.`I_var' > 0 & L2.`I_var' > 0
                                
                            }
                        }
                        
                        * this section should have code to run only if both K & I variables of a specific type are available
                        
                        if (`: word count $Ktan $Itan' == 2) local pair_types tan
                        if (`: word count $Kintan $Iintan' == 2) local pair_types `pair_types' intan
                        
                        * the loop runs zero times if pair_types is empty, so no separate guard is needed
                        foreach typ of local pair_types {
                            
                            // Positive-to-positive jumps with lagged comovement
                            replace flag_jump_${K`typ'} = 1 ///
                                if first_year == 0 ///
                                & log_diff_${K`typ'} != . ///
                                & abs(log_diff_${K`typ'}) > `threshold_K' ///
                                & (log_diff_L1_${I`typ'} == . ///
                                | abs(log_diff_${K`typ'} - log_diff_L1_${I`typ'}) > `com_threshold_K')
                        
                            replace flag_jump_${I`typ'} = 1 ///
                                if first_year == 0 ///
                                & log_diff_${I`typ'} != . ///
                                & abs(log_diff_${I`typ'}) > `threshold_I'
                        }
                        
                        * this section handles every type whose K/I pair is incomplete; it is deliberately not an
                        * -else- branch, because one pair can be complete while the other is not
                        
                        local possible_pairs tan intan
                        local missing_pairs: list possible_pairs - pair_types
                        foreach typ of local missing_pairs {
                            if "${K`typ'}${I`typ'}" != "" {
                                foreach var of varlist ${K`typ'} ${I`typ'} {
                                    // log change between t-1 and t+1, as in #11
                                    gen d_`var' = abs(ln(F.`var') - ln(L.`var'))
                                    // threshold_output is taken from #11 and must be defined before this point;
                                    // !missing() is needed because a missing value compares as larger than any number
                                    replace flag_jump_`var' = 1 if d_`var' > `threshold_output' & !missing(d_`var')
                                }
                            }
                        }
                        
                        
                        * end with code common to all flag_jump variables
                        
                        foreach var of varlist flag_jump_* {
                            replace `var' = 0 if missing(`var')
                        }
                        Incidentally, in general it is a bad idea to use global macros. You may want to change all the globals to locals above.
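
                        Because the concern in this thread is that a variable may be absent from a given dataset, a complementary check is whether each expected name actually exists before looping over it. A minimal sketch with -capture confirm-, assuming a toy dataset that contains only the tangible pair; the local names wanted and present are hypothetical:

                        Code:
                        clear
                        input long ID int year long(capital_tan invest_tan)
                        1 2011 17045 1566
                        1 2012 62261    0
                        end

                        * names we would like to use; only two of them exist in this toy dataset
                        local wanted capital_tan capital_intan invest_tan invest_intan
                        local present
                        foreach v of local wanted {
                            capture confirm numeric variable `v'
                            if _rc == 0 {
                                local present `present' `v'
                            }
                            else {
                                display as text "`v' is not in the dataset; skipping"
                            }
                        }
                        display "variables found: `present'"

                        The resulting list could then feed the loops above in place of macros that are filled in by hand.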
                        Last edited by Hemanshu Kumar; 04 Apr 2025, 05:28.



                        • #13
                          Hemanshu Kumar Thanks so much for your help, I appreciate it a lot!
                          Maybe I don't entirely understand why using globals is not a good idea..



                          • #14
                            The principle behind avoiding global macros is need to know. Ideally a program (which in this context could be a do-file) is told what it needs to know, and nothing else.

                            Setting a global macro can have nasty side-effects anywhere in your code and those side-effects can be really hard to track down. You're basically claiming that you, and only you, could ever possibly define the global macros you defined -- and that they were and are never defined by you in ways that clash with the present intended use -- and so on and so forth.

                            The point is not met by deliberately using specific or unusual names for macros. The point is that when this bites, by definition the wrong or problematic definition could be anywhere in your code or the code that you use.

                            Now StataCorp programmers don't use global macros much, largely for this reason. The risks lie elsewhere, in the first instance the risk of shooting yourself in the foot.

                            You have implied that your code will be shared with collaborators. So, what's the scenario there? First of all, are those collaborators going to know as much Stata as you, or more Stata, or less? Let that be a rhetorical question if you wish, but you need to think about it. If you're expecting your collaborators to define global macros that give definitions for their data, then they need to understand how to do that and the risks entailed.
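
                            One way to apply the need-to-know principle is to pass variable names and thresholds as arguments to a small program rather than reading them from global macros. The sketch below is an illustration only: the program name flag_own_jump and its options generate() and threshold() are hypothetical, and it implements just a simple own-jump rule on positive values.

                            Code:
                            capture program drop flag_own_jump
                            program define flag_own_jump
                                * everything the program needs arrives through its arguments
                                syntax varname(numeric), GENerate(name) [THReshold(real 0.69)]
                                confirm new variable `generate'
                                tempvar d
                                quietly gen double `d' = abs(log(`varlist' / L.`varlist')) ///
                                    if `varlist' > 0 & L.`varlist' > 0 & !missing(L.`varlist')
                                * require a nonmissing change so that missing is not read as a jump
                                gen byte `generate' = `d' > `threshold' & !missing(`d')
                            end

                            clear
                            input long ID int year long invest_tan
                            7 2005  9950
                            7 2006 13262
                            7 2007     0
                            7 2008 27667
                            7 2009  1852
                            end
                            tsset ID year, yearly
                            flag_own_jump invest_tan, generate(flag_jump_invest_tan) threshold(0.69)
                            list, noobs

                            A collaborator then only has to supply a variable name in the call; nothing defined elsewhere in a session can silently change what the program does.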
