Hi everyone! The following question is based on a test that I am trying to run. Although I have included all possible details, please let me know if you need any other info!
Data:
My data contains the following variables: URBAN, AGE, SEX, NCHILD, FAMSIZE, Under14 (treatment indicator, which is 1 for age < 14 and 0 otherwise), LIT (outcome), and Post1986 (post-period indicator). I am running a DiD model.
Situation:
I am trying to run an Exact Randomization Test to check the robustness of my results, for which I have written my own code. The process goes something like this:
The YEAR column has the values 1983, 1987, 1993, 1999, 2004, and 2009. The Post1986 term is 1 for years after 1986 and 0 otherwise (so 1986 itself would be 0). I used each of the six available years in turn as a placebo year and created a new data set each time before proceeding with the ritest package. For this purpose, I had to create a new variable each time, PostXXXX, where XXXX is the placebo year, following the same logic: 1 for years after XXXX and 0 otherwise (so XXXX itself is 0). I understand that my data has no observations for 1986, which is, in fact, the treatment year, but that's just how it is.
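The PostXXXX construction described above can be sketched as follows. This is only an illustration on toy data with one row per survey year; it assumes YEAR is numeric with no missing values (Stata treats missing as larger than any number, so a missing YEAR would otherwise be coded 1).

```stata
* Toy data: one row per survey year (illustration only, not the real data set)
clear
input int YEAR
1983
1987
1993
1999
2004
2009
end

* One placebo indicator per year: 1 for years strictly after the
* placebo year, 0 otherwise (the placebo year itself is coded 0)
foreach y of numlist 1983 1987 1993 1999 2004 2009 {
    generate byte Post`y' = (YEAR > `y')
}

* Inspect the result
list YEAR Post1983 Post1987 Post2009, noobs
```

Note that with this rule Post2009 is 0 for every row, since no year in the data comes after 2009.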
My end objective is to get n simulated estimates from n simulated regressions, which I can then plot as a separate cumulative probability distribution for each placebo year. The y-axis would show the cumulative probability, and the x-axis would show the estimated coefficient on the interaction term (post*treat) from each simulation for that placebo year. Additionally, a vertical red line cutting the distribution would mark the interaction coefficient from the actual regression (Under14*Post1986).
If the vertical line cuts the probability curve at an extremity, it would mean that the actual estimate is an outlier in the distribution of simulated estimates, suggesting that the estimated effect is unlikely to have arisen by chance.
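To make the intended plot concrete, here is a rough sketch of the figure described above. The simulated coefficients are placeholder random draws, not results, and the names coef, cdf, and actual_coeff as well as the value 0.05 are made up for illustration. It uses cumul to build the empirical CDF and the xline() option to draw the vertical line, which is one way to do it and not necessarily the best.

```stata
* Placeholder data standing in for 1000 simulated interaction
* coefficients (random draws for illustration, NOT real estimates)
clear
set obs 1000
set seed 12345
generate double coef = rnormal(0, 0.02)

* Hypothetical value standing in for the actual DiD estimate
local actual_coeff = 0.05

* Empirical cumulative distribution of the simulated coefficients
cumul coef, generate(cdf)

* CDF curve with a red vertical line at the actual estimate
twoway line cdf coef, sort ///
    xline(`actual_coeff', lcolor(red)) ///
    title("Cumulative distribution for placebo year") ///
    xtitle("Simulated coefficient") ///
    ytitle("Cumulative probability")
```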
However, due to collinearity issues among some variables in the process, I had to first randomize the treated and control units (keeping the number of units in each group constant) and then proceed further.
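The randomization step described above (shuffling treatment status while keeping the number of treated units fixed) could look roughly like this. It is a sketch on toy data with 10 units, 4 of them treated; the names u and n_treated are made up for illustration, and the treated count is taken from count and r(N).

```stata
* Toy data: 10 units, the first 4 treated (illustration only)
clear
set obs 10
generate byte Under14 = (_n <= 4)

* Number of treated units in the original data
count if Under14 == 1
local n_treated = r(N)

* Put the units in random order, then mark the first n_treated as treated
set seed 12345
generate double u = runiform()
sort u
generate byte randomized_Under14 = (_n <= `n_treated')
drop u

* Check that the number of treated units is unchanged
count if randomized_Under14 == 1
assert r(N) == `n_treated'
```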
Code:
local years 1983 1987 1993 1999 2004 2009
local reps 1000

* Store the original number of treated and control units
gen original_treated = Under14
gen original_control = 1 - Under14
local treated_count = sum(original_treated)
local control_count = sum(original_control)

foreach year in `years' {
    forval i = 1/`reps' {
        * Randomize treated and control groups while maintaining original proportions
        gen random_assign = runiform()
        sort random_assign

        * Assign treated and control groups
        gen randomized_Under14 = 0
        replace randomized_Under14 = 1 if _n <= `treated_count'

        * Interaction term for placebo year
        generate Post`year'_randomized = Post`year' * randomized_Under14

        * Run ritest with the randomized groups
        ritest randomized_Under14, stat(_b[Post`year'_randomized]) reps(1): ///
            regress LIT Post`year'##randomized_Under14 URBAN AGE SEX NCHILD FAMSIZE randomized_Under14 Post`year' i.state_encoded
    }

    * Save results for the placebo year
    save results_randomized_`year', replace
}

* Visualization code (as given earlier)
foreach year in `years' {
    use results_randomized_`year', clear

    * Generate cumulative distribution
    gen cdf = _n / _N

    twoway (line cdf stat, sort) ///
        (vline `actual_coeff', lcolor(red)), ///
        title("Cumulative Distribution for `year'") ///
        xtitle("Simulated Coefficient") ///
        ytitle("Cumulative Probability")
}
Error:
I am always getting this error: expression list required r(100);
I pinpointed the line of code for which I am getting it; it is in the place where the ritest is being run:
ritest randomized_Under14, stat(_b[Post`year'_randomized]) reps(1)
Despite multiple attempts, I cannot work out why this is happening. My guess is that when I randomize the Under14 variable, I am only shuffling that column and not the variables it depends on, such as AGE. That is, when the code randomly sets Under14 = 1 for a unit, it does not account for that unit's AGE, which may be 17. But that seems like a very minor issue, and there might be some other reason for this error.
Can anyone please help me out with this error? I understand it's a very long question, but I am stuck at the moment! Thank you!