Good morning,
I wrote a Mata function that uses panelsetup() to do a calculation by groups, and after almost 40 hours of debugging it seemed, miraculously, to be working.
I tried it on another dataset, and it failed with a conformability error, r(3200), raised by st_store().
After scratching my head for another business day, I finally figured out what the problem is: when I have missing values in my data, I exclude them with my -mark- command, and from then on I cannot put my results back into my data, because the results vector has fewer rows than the original data. So the question is: how do you put a Mata results matrix back into your Stata data when missing values cause the dimensions to differ?
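To make the mismatch concrete, here is a minimal sketch that is separate from my actual function. It assumes the auto dataset that ships with Stata and uses the full variable name rep78. It compares the number of rows that st_data() returns when a touse variable is supplied with the number of observations in the dataset.

```stata
sysuse auto, clear
mark touse if !missing(rep78)

mata:
// st_data() with a selection variable returns only the rows where touse != 0
y = st_data(., "price", "touse")
rows(y)       // number of selected observations
st_nobs()     // number of observations in the whole dataset
end
```

Whenever the two numbers differ, a call of the form st_store(., varindex, result) is asked to write a shorter vector into every observation, which is the situation in which I get the conformability error.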
Here is an example of the problem. In the following Mata code I write a Mata function that replicates what -egen, min()- and -egen,max()- do, that is, it calculates min and max by groups.
The function works fine when I do not have missing data, and indeed produces the same results as -egen, min()- and -egen, max()-.
So far so good.
But now when I exclude some missing values with my -mark- statement, it all falls to pieces:
So what do we do here? What is the solution for moving results back from Mata to Stata when missing values make the dimensions of the objects differ?
The error message from my original function was:
Code:
           st_store():  3200  conformability error
      myquantile_by():     -  function returned error
              <istmt>:     -  function returned error
r(3200);
Here is the example function, together with the run without missing values, where it matches -egen-:
Code:
clear
clear mata

mata:
void function mean_by_store(string scalar var, string scalar groupid, string scalar touse)
{
    real scalar i, j, j0, j1, min, max
    real colvector id, y
    real matrix info, result
    string colvector minmax

    minmax = ("min" \ "max")
    id = st_data(., groupid, touse)
    y = st_data( ., var, touse)
    result = J(rows(y),2,.)
    info=panelsetup(id, 1)
    for (i=1; i<=rows(info); i++) {
        j0 = info[i, 1]
        j1 = info[i, 2]
        min = min(y[|j0\j1|])
        max = max(y[|j0\j1|])
        for (j=j0; j<=j1; j++) {
            result[j,1] = min
            result[j,2] = max
        }
    }
    for (j=1; j<=2; j++) st_store(., st_addvar("double", "my_"+minmax[j]), result[,j])
}
end

sysuse auto
keep price rep
sort rep
mark touse
mata: mean_by_store("price", "rep", "touse")
egen min = min(price), by(rep)
egen max = max(price), by(rep)
summ my_min min my_max max

. summ my_min min my_max max

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      my_min |        74    3589.203    261.2575       3291       4195
         min |        74    3589.203    261.2575       3291       4195
      my_max |        74    13178.01    2871.961       4934      15906
         max |        74    13178.01    2871.961       4934      15906
And here is the run where I exclude the missing values with -mark-:
Code:
. sysuse auto, clear
(1978 Automobile Data)

. keep price rep

. sort rep

. mark touse if !missing(rep)

. mata: mean_by_store("price", "rep", "touse")
           st_store():  3200  conformability error
      mean_by_store():     -  function returned error
              <istmt>:     -  function returned error
r(3200);

end of do-file

r(3200);
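One possible direction, offered as a sketch rather than a confirmed solution: st_store() accepts an optional selection variable as its third argument, in the same way st_data() does, so the write can be restricted to the observations that were read. The function store_demo() and the variable my_demo below are hypothetical names used only for illustration; they are not part of my function above.

```stata
sysuse auto, clear
mark touse if !missing(rep78)

mata:
// Hypothetical demo, not the original function: read the selected rows,
// build a result with the same number of rows, and store it back into
// those same observations only.
void function store_demo(string scalar touse)
{
    real colvector y, result

    y = st_data(., "price", touse)        // only rows where touse != 0
    result = J(rows(y), 1, max(y))        // one result row per selected row
    // The third argument limits the write to observations with touse != 0;
    // all other observations keep the missing value set by st_addvar().
    st_store(., st_addvar("double", "my_demo"), touse, result)
}
store_demo("touse")
end

count if missing(my_demo)
```

If this behaves as assumed, the analogous change in mean_by_store() would be to pass touse as the third argument in its final st_store() call, leaving the excluded observations missing in my_min and my_max.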