While toying with a simulation, I noticed that lpoly was systematically failing to recognize the true linear relationship between two variables. This is visually obvious when the lpoly line is compared to a scatterplot or to the lowess line, as I demonstrate in a simpler example below. The problem must stem from the lpoly default settings, as the "localp" command, which I understand to be a particular set-up of lpoly, does not suffer from the same problem.
Can anyone explain which default settings within lpoly are causing this systematic bias? I want to be sure that I avoid this problem if using lpoly in future. Thank you!
Code:
clear
drop _all
set obs 10000
set seed 12345

gen larea = rnormal(-1.5, 1.2)                /* original variable */
gen error = rnormal(0, .4)                    /* error to be added around larea */
gen larea_me_big = -.3 + .8*larea + error     /* larea is tilted, and noise is added */
gen larea_me_sm = -.15 + .9*larea + error     /* smaller tilt, same noise */
gen larea_me_rnd = larea + error              /* only noise is added, no tilt */

/* In each of the plots below, lpoly is missing the correct and obvious
   linear relationship between the noisy variable and the original
   variable, larea */
two (scatter larea_me_big larea, mcolor(gray*.5) msize(tiny)) ///
    (lowess larea_me_big larea, lcolor(blue)) ///
    (lpoly larea_me_big larea, lcolor(green)) ///
    (line larea larea, lcolor(black)), ///
    legend(order(1 "data" 2 "lowess fit" 3 "lpoly fit" 4 "45 degrees"))

two (scatter larea_me_sm larea, mcolor(gray*.5) msize(tiny)) ///
    (lowess larea_me_sm larea, lcolor(blue)) ///
    (lpoly larea_me_sm larea, lcolor(green)) ///
    (line larea larea, lcolor(black)), ///
    legend(order(1 "data" 2 "lowess fit" 3 "lpoly fit" 4 "45 degrees"))

two (scatter larea_me_rnd larea, mcolor(gray*.5) msize(tiny)) ///
    (lowess larea_me_rnd larea, lcolor(blue)) ///
    (lpoly larea_me_rnd larea, lcolor(green)) ///
    (line larea larea, lcolor(black)), ///
    legend(order(1 "data" 2 "lowess fit" 3 "lpoly fit" 4 "45 degrees"))

/* yet localp DOES recognize the correct linear relationship */
localp larea_me_rnd larea
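One further comparison that might help narrow this down, offered as a sketch and not as a confirmed diagnosis: lpoly's documented default is degree(0), i.e. local mean smoothing, so it may be worth overlaying a local linear fit with degree(1) on the same data. That the degree is what separates lpoly from localp here is my assumption. The kernel and bandwidth are left at their defaults in both fits so that only the degree changes.

```stata
* Sketch only: assumes the gap comes from the polynomial degree.
* degree(0) is lpoly's documented default (local mean);
* degree(1) requests a local linear fit instead.
two (scatter larea_me_rnd larea, mcolor(gray*.5) msize(tiny)) ///
    (lpoly larea_me_rnd larea, degree(0) lcolor(green)) ///
    (lpoly larea_me_rnd larea, degree(1) lcolor(red)) ///
    (line larea larea, lcolor(black)), ///
    legend(order(1 "data" 2 "lpoly degree(0)" 3 "lpoly degree(1)" 4 "45 degrees"))
```

If the degree(1) line tracks the 45-degree line while the degree(0) line does not, the default degree would be the setting to change. If both lines bend in the same way, the kernel() or bwidth() defaults would be the next things to vary.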