Originally posted by Tommaso Salvitti
Both of your peer outcome variables* seem to be so-called limited dependent variables (LDVs), ordered categorical in particular, with both floors and ceilings. You might want to Google for methods suitable for assessing association with LDVs.
Stas Kolenikov's polychoric has a limit of, I think, 10 categories before it automatically switches over to providing the polyserial correlation, but you can use gsem to compute the polychoric correlation in such cases. I illustrate its use in your case below (the syntax has changed slightly since that post years ago, because Stata has since changed how constant equations are referenced in estimation commands).
Code:
version 18.0

clear *

quietly input byte(EMOTIV ados_todtoddler_totale)
<redacted for brevity>
end

rename (EMOTIV ados_todtoddler_totale ) (emo ado)

contract _all, freq(count)

gsem (emo@1 ado@1 <- F, oprobit) [fweight=count], nocnsreport nodvheader nolog
nlcom rho:_b[/var(F)] / (1 + _b[/var(F)]) // <= here

estimates store Full
quietly gsem (emo@1 ado@1 <- , oprobit) [fweight=count]
lrtest Full

exit
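In case the nlcom expression looks arbitrary, here is a rough sketch of where it comes from. The notation is mine and not part of the code: starred names are the latent continuous responses underlying the two ordered variables, \psi stands for var(F), and I'm assuming the usual ordered-probit convention that each latent error has variance fixed at 1, with both loadings constrained to 1 as in the gsem call.

```latex
% Requires amsmath. Symbols (\psi, \varepsilon, starred variables) are illustrative notation.
\begin{align*}
\text{emo}^{*} &= F + \varepsilon_{1}, \qquad \text{ado}^{*} = F + \varepsilon_{2},\\
F &\sim N(0,\psi), \qquad \varepsilon_{1},\ \varepsilon_{2} \sim N(0,1)
   \ \text{independent of each other and of } F,\\
\operatorname{Var}(\text{emo}^{*}) &= \operatorname{Var}(\text{ado}^{*}) = \psi + 1,
   \qquad \operatorname{Cov}(\text{emo}^{*},\text{ado}^{*}) = \psi,\\
\rho &= \frac{\psi}{\sqrt{(\psi + 1)(\psi + 1)}} = \frac{\psi}{1 + \psi}.
\end{align*}
```

Under those assumptions, rho is the correlation between the two latent responses, which is what the polychoric correlation estimates. One consequence of this parameterization is that, because a variance cannot be negative, it can only represent a nonnegative correlation.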
I'm not sure what's convinced you that the relationship between whatever these two variables measure is nonmonotonic. I'm not sold on what I see with lowess on the subset of 90 nonmissing observations that you provide above, but you might have seen something more compelling with the complete dataset or have reasons based in theory or prior experience.
* One variable's name suggests that it's the total score on the Toddler Module of the ADOS-2 instrument. Going by what Google turns up for the other's name, it's some kind of electroencephalographic measurement.