Help with ML Program

Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#1

02 Feb 2022, 14:49

Hello, I am trying to define and estimate the parameters from a log likelihood function using minimum wage data and unemployment and wage data from the National Agricultural Workers Survey. The individual likelihood contributions are defined as follows (from Flinn (2006): Minimum Wage Effects on Labor Market Outcomes under Search, Matching, and Endogenous Contact Rates):

For unemployed individuals

For employed individuals earning the minimum wage:

For employed individuals earning above the minimum wage:

where m denotes the minimum wage, w denotes the wage of individuals who are paid at least $0.50 above the minimum wage, rho*V_n(m) is denoted by the parameter "theta" in the code below, G() denotes the CDF of the lognormal distribution, and g() denotes the PDF of the lognormal distribution. So far I have specified my program as follows: I want to estimate mu and sigma (which are embedded in the G() and g() functions), as well as the parameters alpha, theta, lambda, and eta. Here is the code I have been using so far:

use "C:\Users\Zach\Dropbox\Confidential NAWS Data\Confidential NAWS Data 2018\Raw Files\NAWS Datasets (Stata)\nawscrtdvars1db18_STATA.dta", clear
rename *, lower
gen year = year(cs2)
merge m:1 state year using "C:\Users\Zach\Dropbox\AEWR Project\Data Files\Generated Data Files\State Minimum Wages No FY 1990-2018.dta"
drop _m
merge m:1 year using "C:\Users\Zach\Dropbox\AEWR Project\Data Files\Generated Data Files\Federal Minimum Wages No FY 1990-2018.dta"
drop _m
replace fed_min_wage = 3.35 if year==1988 | year==1989 //impute fed min wage data that is missing
replace min_wage = fed_min_wage if real_min_wage==. //impute fed min wage for states with no min wage
replace min_wage = fed_min_wage if state=="GA" | state=="WY" //impute fed min wage for states with lower min wage
replace fwweeks = 52 if fwweeks>52 //round down weeks for leap year
replace fwweeks = fwweeks/52 //normalize work weeks to 52 weeks = 1
gen unemployed = fwweeks < 52 //identify workers who were unemployed during year
gen ti = 52 - fwweeks if fwweeks<52 //identify length of unemployment spell
gen paid_min_wage = waget1<=(min_wage + .50) //identify individuals paid about the min wage
replace waget1 = min_wage if waget1<=(min_wage+.50) //replace wages for individuals who make min wage + .50
gen paid_above_min_wage = waget1>(min_wage+.50) //identify individuals paid above min wage
gen ln_mw=ln(min_wage) //generate the log of min wage
gen waget1_hi = waget1 if waget1>(min_wage +.50) //generate wages for individuals who make above min
gen ln_waget1_hi = ln(waget1) if waget1>(min_wage +.50) //take log of wages for those above min wage
replace paid_min_wage=0 if unemployed==1 //classify individuals as not paid min wage if classified as unemployed
replace paid_above_min_wage=0 if unemployed==1 //classifiy individual as not paid hi wage if classified as unemployed
program drop AEWR_1
program define AEWR_1
version 1.0
args llf mu sigma alpha theta lambda eta
tempvar G1 G2 G3 T1 T2 T3 T4 T5 T6
generate double `G1' = normal(sqrt((ln_mw - `mu')/`sigma')^2)
generate double `G2' = normal([(ln_mw - (1 - `alpha')*`theta')/`alpha' - `mu']/`sigma')
generate double `G3' = normalden([(ln_waget1_hi - (1 - `alpha')*`theta')/`alpha' - `mu']/`sigma')/(`sigma'*waget1_hi)
generate double `T1' = ln(sqrt(`lambda')^2) - ln(sqrt(`eta' + `lambda'*`G1')^2)
generate double `T2' = ln(sqrt(`eta')^2) + ln(sqrt(`G1')^2) if unemployed==1
generate double `T3' = -sqrt(`lambda'*`G1'*ti)^2 if unemployed==1
generate double `T4' = ln(sqrt(`G1' - `G2')^2) if paid_min_wage==1
generate double `T5' = -ln(sqrt(`alpha')^2) if paid_above_min_wage==1
generate double `T6' = ln(sqrt(`G3')^2) if paid_above_min_wage==1
quietly replace `llf' = `T1' + `T2' + `T3' if unemployed==1
quietly replace `llf' = `T1' + `T4' if paid_min_wage==1
quietly replace `llf' = `T1' + `T5' + `T6' if paid_above_min_wage==1
end

ml model lf AEWR_1 () () () () () ()
ml check
ml search, repeat()
ml maximize, iterate(50) difficult
ml graph
exit

I get this message indicating that feasible values cannot be found.

First of all, I am not sure if I should be using the lf model or if I should switch to using the d0 estimator. Also, I am not sure if I am specifying the equations correctly in terms of the () () ... () after the ml model lf AEWR_1 command. At this point, I do not want my dependent variables (i.e., ln_mw, ln_waget1_hi, and ti) to depend on other variables, which is why I left the () without an equation in them. Is this how I should be doing it? Any help you could provide would be greatly appreciated, as I have spent the past two days trying to figure this out.

Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#2

02 Feb 2022, 18:55

I noticed an error with my previous code, which normalized the weeks worked to 1 for full year. Even after correcting that error, I get the same result.

Specifically, I deleted the line of code that reads: replace fwweeks = fwweeks/52 //normalize work weeks to 52 weeks = 1
Maarten Buis

Join Date: Mar 2014

Posts: 3404
#3

03 Feb 2022, 09:02

The code is too complicated and too specific to your data for us to easily see what the problem is. In general, my strategy would be to simplify the model a lot and add complications one at a time until you find the problem.
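
A minimal sketch of that strategy (not code from the thread): evaluate one building block of the likelihood by hand at trial parameter values and count how many results are missing. The toy data and the scalars mu and sigma below are assumptions made up for illustration; they are not NAWS data or estimates. Since sqrt() returns missing for negative arguments, the sqrt(...)^2 construction used in AEWR_1 comes back missing whenever ln_mw is below mu, which may be one source of infeasible values.

```stata
clear
set obs 5
generate double ln_mw = ln(5 + _n)          // toy log minimum wages, not NAWS data

scalar mu    = 2                            // hypothetical trial value
scalar sigma = 0.5                          // hypothetical trial value

// Standardized log minimum wage at the trial values
generate double z = (ln_mw - scalar(mu))/scalar(sigma)

// Building block as written in AEWR_1: missing wherever z < 0
generate double G1 = normal(sqrt(z)^2)
count if missing(G1)

// Same block without the sqrt()^2 wrapper: defined for every z
generate double G1_alt = normal(z)
count if missing(G1_alt)

list ln_mw z G1 G1_alt
```

Repeating this check term by term (G2, G3, T1, and so on) shows which piece first produces missing values at a given set of parameters.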

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#4

03 Feb 2022, 11:11

Hi Maarten,
Thanks for the response. I will do as you suggest. Based on the way the likelihood contributions are defined, can you speak to whether the "lf" estimator would be able to handle this type of optimization problem, or whether I would need to use the d0 or d1 estimator instead? Similarly, it is unclear to me whether the way I have specified the ml model command with the "() () ... ()" equation specifications is appropriate. Specifically, some of the parameters enter into different parts of the log likelihood function, so it's unclear to me whether I need to specify a specific variable to be associated with them, whether I can specify two variables associated with them, and how I should go about doing that. Any help or input would be greatly appreciated. Thanks.
FernandoRios

Join Date: Apr 2014

Posts: 2408
#5

03 Feb 2022, 12:00

Hi Zachariah
LF should be enough for your purposes. The others may be more efficient and faster, but for a first run, lf is good enough (many of my own programs are based on it).
Regarding the ml model command line: I have never tried using just empty parentheses. I always prefer to give the equations names, so I know what each parameter represents:

ml model lf AEWR_1 (mu: ) (sigma: ) etc.

Something to consider, though: as with many other nonlinear estimations, solutions to this type of model depend strongly on initial conditions, so bad initial conditions may cause an endless loop.
That being said, here are two options you could try:
1. Use better initial conditions, using "init()". That may speed things up, if you know good values for your parameters.
2. I notice you are modeling "sigma" directly. This can have very bad convergence properties. I would suggest estimating "lnsigma" instead and then transforming it back.
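
As a rough sketch of option 1, starting values can be computed from summary statistics and passed to ml init before maximizing. Everything here is illustrative: the auto dataset, the evaluator name mynormal, and the locals mu0 and lnsigma0 are assumptions, not part of the original model.

```stata
sysuse auto, clear

capture program drop mynormal
program define mynormal
    args lnf mu lnsigma
    // Normal log density of mpg; sigma is kept positive through exp()
    quietly replace `lnf' = ln(normalden(mpg, `mu', exp(`lnsigma')))
end

// Starting values taken from the sample mean and standard deviation
quietly summarize mpg
local mu0      = r(mean)
local lnsigma0 = ln(r(sd))

ml model lf mynormal (mu: ) (lnsigma: )
ml init `mu0' `lnsigma0', copy      // values are read in equation order
ml maximize
```

With the copy option, the numbers are assigned by position, so they must follow the order of the equations in the ml model statement.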

HTH
Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#6

03 Feb 2022, 12:38

Hi Fernando,
Thank you for the response. Your input is very helpful. I will try your suggestions to see if they help me resolve my issue.
Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#7

03 Feb 2022, 12:47

Hi Fernando, can you please elaborate on how I should specify lnsigma in my code? Should I use ln(`sigma') in place of `sigma' or something else?
FernandoRios

Join Date: Apr 2014

Posts: 2408
#8

03 Feb 2022, 15:48

Hi Zachariah
Consider the very simple OLS via ML

This will work, but will be problematic:

Code:

program myols1
    args lnf xb sigma
    // $ML_y1 is the dependent variable named in the ml model statement
    qui: replace `lnf' = log(normalden($ML_y1, `xb', `sigma'))
end

This will work better, because it avoids, for example, sigma = 0 or sigma < 0:

Code:

program myols2
    args lnf xb lnsigma
    // exp() keeps the standard deviation strictly positive
    qui: replace `lnf' = log(normalden($ML_y1, `xb', exp(`lnsigma')))
end

So, you estimate lnsigma as the parameter, but use exp(`lnsigma') within the evaluator.
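
To illustrate the "transform it back" step, here is a small sketch that fits a linear model with lnsigma as the estimated parameter and then recovers sigma with nlcom. The auto dataset, the covariates, and the program name myols_ln are assumptions chosen only so that the example runs on its own.

```stata
sysuse auto, clear

capture program drop myols_ln
program define myols_ln
    args lnf xb lnsigma
    // $ML_y1 is the dependent variable from the ml model statement
    quietly replace `lnf' = ln(normalden($ML_y1, `xb', exp(`lnsigma')))
end

ml model lf myols_ln (xb: mpg = weight foreign) (lnsigma: )
ml maximize

// Back-transform: point estimate and delta-method standard error for sigma
nlcom (sigma: exp(_b[lnsigma:_cons]))
```

The same pattern would apply to any parameter that must stay positive, such as a rate, by estimating its log and exponentiating inside the evaluator.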
Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#9

03 Feb 2022, 17:09

Hi Fernando, excellent! Thanks for following up. That seems like a nice solution. I have one more question about the () () ... () in the ml model command, though. Do I need to associate each of the parameters with one of the variables in the LLF, or can I just leave them unassociated with variables? I'm unclear what the purpose of the () () ... () is for running the ml model command in Stata. If you could provide some insight into that, I would be very grateful.
FernandoRios

Join Date: Apr 2014

Posts: 2408
#10

03 Feb 2022, 19:16

Yes, at least that is how I have always treated those parameters.
In the model I proposed before, for example, I would call them as follows:
ml model lf myols2 (xb: wage = x1 x2 x3) /lnsigma
or
ml model lf myols2 (xb: wage = x1 x2 x3) (lnsigma: )
I do this a) to remember what each parameter is, and b) because ml uses those equations (as I understand it) to sort out which controls go into which part of the model.
Hope this helps.
Fernando

Last edited by FernandoRios; 03 Feb 2022, 19:36.
Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#11

04 Feb 2022, 00:19

Hi Fernando,
Yes, that helps, but some of the parameters in the LLF are associated with more than one variable, and I'm not sure which variables the other parameters are directly associated with, as they simply appear in parts of the LLF that are not within a standard framework like OLS or the normal distribution. Also, I am not controlling for other variables in my model (at least while I try to get an initial optimization to work), so I'm unsure how I should be specifying the () () ... () equations given my setting. Do you have any ideas about how I should specify the () () ... () in my setting?

FernandoRios

Join Date: Apr 2014
Posts: 2408

#12

04 Feb 2022, 05:38

Well, the way I see it, your program:

Code:

program define AEWR_1
version 1.0
args llf mu sigma alpha theta lambda eta
tempvar G1 G2 G3 T1 T2 T3 T4 T5 T6
generate double `G1' = normal(sqrt((ln_mw - `mu')/`sigma')^2)
generate double `G2' = normal([(ln_mw - (1 - `alpha')*`theta')/`alpha' - `mu']/`sigma')
generate double `G3' = normalden([(ln_waget1_hi - (1 - `alpha')*`theta')/`alpha' - `mu']/`sigma')/(`sigma'*waget1_hi)
generate double `T1' = ln(sqrt(`lambda')^2) - ln(sqrt(`eta' + `lambda'*`G1')^2)
generate double `T2' = ln(sqrt(`eta')^2) + ln(sqrt(`G1')^2) if unemployed==1
generate double `T3' = -sqrt(`lambda'*`G1'*ti)^2 if unemployed==1
generate double `T4' = ln(sqrt(`G1' - `G2')^2) if paid_min_wage==1
generate double `T5' = -ln(sqrt(`alpha')^2) if paid_above_min_wage==1
generate double `T6' = ln(sqrt(`G3')^2) if paid_above_min_wage==1
quietly replace `llf' = `T1' + `T2' + `T3' if unemployed==1
quietly replace `llf' = `T1' + `T4' if paid_min_wage==1
quietly replace `llf' = `T1' + `T5' + `T6' if paid_above_min_wage==1
end

should be called:

Code:

ml model lf AEWR_1 (mu: )  (sigma: )  ( alpha : )  (theta : )  (lambda : )  (eta: )

because each one of these is a unique parameter.

And I see that everything else is already inside the program, so it does not need to be specified.

In other words, if your first line for the arguments is:
args lnf x1 x2 x3 x4
The parameters in the model that will need to be "named" in the parenthesis are (x1: ) (x2: ) (x3: ) (x4: )
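
A small self-contained sketch of this mapping, under assumptions: the auto dataset and the evaluator name twogroup are illustrative only. As in AEWR_1, the data variables are referenced inside the program, the likelihood has different branches selected with if, and one parameter (lnsigma) enters more than one branch. Each entry after lnf in args still gets exactly one constant-only equation.

```stata
sysuse auto, clear

capture program drop twogroup
program define twogroup
    args lnf mu0 mu1 lnsigma
    // Branch for domestic cars: mean mu0, shared standard deviation
    quietly replace `lnf' = ln(normalden(mpg, `mu0', exp(`lnsigma'))) if foreign == 0
    // Branch for foreign cars: mean mu1, same lnsigma as above
    quietly replace `lnf' = ln(normalden(mpg, `mu1', exp(`lnsigma'))) if foreign == 1
end

// One named, constant-only equation per parameter, in the order given in args
ml model lf twogroup (mu0: ) (mu1: ) (lnsigma: )
ml maximize
```

No dependent variable or covariate appears in the ml model statement here, because the evaluator already refers to mpg and foreign directly.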
HTH

EDIT
I just tried to see what happens when you simply use empty parentheses "()". It simply adds a generic name to that parameter, specifically "EQ#", where # represents the number of the equation.

Last edited by FernandoRios; 04 Feb 2022, 06:24.


Zachariah Rutledge

Join Date: Jun 2019

Posts: 31
#13

04 Feb 2022, 10:01

Hi Fernando,
Okay, that's what I thought. That is super helpful. And once again, thanks for your help.

Cheers,
Zach