How to use lnmvnormalden(M,V,X) function (natural logarithm of the multivariate normal density)?

Stephen Jenkins

Join Date: Apr 2014
Posts: 1422

How to use lnmvnormalden(M,V,X) function (natural logarithm of the multivariate normal density)?

27 Sep 2019, 04:43

I need to calculate the (logs of) bivariate normal densities for a large number of observations on two variables R and S (this is part of some likelihood evaluator code). My searching suggests that I should use the lnmvnormalden function. (I'm using Stata code in the evaluator rather than Mata code.) The online help description is as follows:

Code:

lnmvnormalden(M,V,X)
       Description:  the natural logarithm of the multivariate normal density

                     M is the mean vector, V is the covariance matrix, and X
                     is the random vector.
       Domain M:     1 x n and n x 1 vectors
       Domain V:     n x n, positive-definite, symmetric matrices
       Domain X:     1 x n and n x 1 vectors
       Range:        -8e+307 to 8e+307

I'm assuming that in the bivariate case, that "n" = 2. But, if that is the case, how do I calculate the log densities for a large number of observations? I was looking for functionality analogous to the use of lnnormalden in the univariate normal density case -- with this one can feed the function a variable name and calculate logs of densities for a large number of observations (as shown in the example below).

Code:

lnnormalden(x,m,s)
Description: the natural logarithm of the normal density with mean m and standard deviation s

lnnormalden(x,0,s) = lnnormalden(x,s) and lnnormalden(x,m,s) = lnnormalden((x-m)/s)
- ln(s)
Domain x: -8e+307 to 8e+307
Domain m: -8e+307 to 8e+307
Domain s: 1e-323 to 8e+307
Range: 1e-323 to 8e+307

My attempts to use lnmvnormalden are shown below, and you'll see that I am getting missing values even in a case in which I think the matrices and vectors are compatible with the help description about their dimensions. At the end of the post is a dataex extract of the data set I've been using.

I suspect that I am misunderstanding something fundamental about lnmvnormalden and would appreciate being put right. Thanks. (Stata version 15.1 on a Windows server.)

Code:

. mkmat R S, matrix(X)

. mat list X

X[50,2]
             R          S
 r1  9.5391407  9.4915266
 r2  9.6193991  9.7069855
 r3  10.415143  10.585219
 r4  9.5922642  9.5764217
 r5  9.4610987  9.5772448
 r6  9.5327864   9.477334
 r7   8.244071  7.9390125
 r8  9.4064007  9.4318829
 r9  9.1348619  9.1652555
r10  9.3621168  9.3593121
r11  9.6324663  9.2575293
r12   9.980217  9.9595852
r13  10.310984  10.339585
r14  9.9761333   9.931035
r15  9.5503778  9.5343771
r16  8.4362001  8.4235115
r17  8.4213428  8.4067535
r18  10.700206   10.55044
r19  9.5288668  9.4493494
r20  8.4703112  9.3805895
r21  8.1053076  8.8629341
r22  9.7116613   9.647891
r23  10.173896  9.8595171
r24  10.020025  10.067356
r25  10.999881  11.042206
r26  10.408376  10.587072
r27  9.3610849  9.3850803
r28  10.240103  10.201797
r29  9.7658339   9.614254
r30  9.3755159  9.4409666
r31  9.8051577   9.865303
r32  8.1886892  8.1493082
r33  9.2007952  9.1513138
r34  8.4523344  9.4964209
r35  8.7395363  8.7519808
r36  9.8634462  9.9078016
r37  8.7152243  8.5628929
r38  9.9405422  9.9521141
r39  9.8146019  9.8067474
r40  9.9920931  10.283386
r41  10.207658  10.085143
r42  10.494436  10.502324
r43  9.5510178  8.7480259
r44  8.7528973  8.5524721
r45  10.170954   10.15915
r46  9.9867716  10.123063
r47  9.8254719  9.8778028
r48  9.9557476  9.8533163
r49  8.8417377  9.2087393
r50  10.170341  10.035743

.
. matrix Y = (10.170341 , 10.035743)  // last row of X

. mat list Y  

Y[1,2]
           c1         c2
r1  10.170341  10.035743

.
. matrix M = (9.8 , 9.8)

. mat list M

M[1,2]
     c1   c2
r1  9.8  9.8

.
. scalar v11 = .6

. scalar v22 = (1-.4)^2*(.6)^2 + (.3)^2

. scalar v12 = ((1-.4)^2*(.6)^2)/( .6* sqrt( (1-.4)^2*(.6)^2 + (.3)^2 )  )

. matrix V = (v11, v12 \ v12, v22)

. mat list V

symmetric V[2,2]
           c1         c2
r1         .6
r2  .46093277      .2196

.
. mat dir
            V[2,2]
            M[1,2]
            Y[1,2]
            X[50,2]

.
. gen v1 = normalden(R, 9.8, .6)

. su v1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v1 |         50    .4439967    .2291873   .0123145   .6648793

.
. ge v2 = lnnormalden(R, 9.8, .6)

. su v2

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v2 |         50   -1.141031    1.069473  -4.396977  -.4081499

.
. ge v3 = lnmvnormalden(M, V, X)
(50 missing values generated)

. su v3

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v3 |          0

.
. ge v4 = lnmvnormalden(M, V, Y)
(50 missing values generated)

. su v4

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v4 |          0

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(R S)
 9.539141  9.491527
 9.619399  9.706985
10.415143  10.58522
 9.592264  9.576422
 9.461099  9.577245
 9.532786  9.477334
 8.244071  7.939013
 9.406401  9.431883
 9.134862  9.165256
 9.362117  9.359312
 9.632466  9.257529
 9.980217  9.959585
10.310984 10.339585
 9.976133  9.931035
 9.550378  9.534377
   8.4362 8.4235115
 8.421343  8.406754
10.700206  10.55044
 9.528867  9.449349
 8.470311 9.3805895
 8.105308  8.862934
 9.711661  9.647891
10.173896  9.859517
10.020025 10.067356
 10.99988 11.042206
10.408376 10.587072
 9.361085   9.38508
10.240103 10.201797
 9.765834  9.614254
 9.375516  9.440967
 9.805158  9.865303
 8.188689  8.149308
 9.200795  9.151314
 8.452334  9.496421
 8.739536  8.751981
 9.863446  9.907802
 8.715224  8.562893
 9.940542  9.952114
 9.814602  9.806747
 9.992093 10.283386
10.207658 10.085143
10.494436 10.502324
 9.551018  8.748026
 8.752897  8.552472
10.170954  10.15915
 9.986772 10.123063
 9.825472  9.877803
 9.955748  9.853316
 8.841738  9.208739
10.170341 10.035743
end

Tags: None

John Mullahy

Join Date: Dec 2016

Posts: 736
#2

27 Sep 2019, 05:55

Stephen: In your particular example the V matrix you specify is not positive definite. I wonder if this may be what is causing the result?
Comment

Stephen Jenkins

Join Date: Apr 2014
Posts: 1422

27 Sep 2019, 07:33

Thanks, John: very helpful observation. If I change matrix V to be the 2x2 identity matrix (to guarantee positive definitiveness in this toy example), I get a non-missing return from the lnmvnormalden function call when I supply it matrix Y (2x1) as an argument (see below). However, I still get missing values if I supply it X (50x1) -- mostly likely because 50 > 2. Thus I am still still left with the problem of how to calculate multivariate log densities for lots of obs!

[PS for those interested, I am trying to fit to UK data the measurement error models used by Kapteyn & Ypma ("Measurement error and misclassification: a comparison of survey and administrative data", Journal of Labor Economics, 2007, 25(3), 523-551]

Code:

. matrix M = (9.8 , 9.8)

. mat list M

M[1,2]
     c1   c2
r1  9.8  9.8

. 
. 
. scalar v11 = 1

. scalar v22 = 1

. scalar v12 = 0

. matrix V = (v11, v12 \ v12, v22)

. mat list V

symmetric V[2,2]
    c1  c2
r1   1
r2   0   1

. 
. matrix symeigen ev lambda = V

. mat list lambda

lambda[1,2]
    e1  e2
r1   1   1

. 
. mat D = det(V)

. mat li D

symmetric D[1,1]
    c1
r1   1

. 
. 
. mat dir
            D[1,1]
       lambda[1,2]
           ev[2,2]
            V[2,2]
            M[1,2]
            Y[1,2]
            X[50,2]

. 
. gen v1 = normalden(R, 9.8, .6)

. su v1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v1 |         50    .4439967    .2291873   .0123145   .6648793

. 
. ge v2 = lnnormalden(R, 9.8, .6)

. su v2

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v2 |         50   -1.141031    1.069473  -4.396977  -.4081499

. 
. ge v3 = lnmvnormalden(M, V, X)
(50 missing values generated)

. su v3

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v3 |          0

. 
. ge v4 = lnmvnormalden(M, V, Y)

. su v4

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          v4 |         50   -1.934241           0  -1.934241  -1.934241

Comment

Stephen Jenkins

Join Date: Apr 2014

Posts: 1422
#4

01 Oct 2019, 02:23

In #3, I wrote

Thus I am still still left with the problem of how to calculate multivariate log densities for lots of obs!

Isabel Canette of StataCorp has kindly pointed me towards the Mata function __lnmvnormalden() to do what I want. (Note the 2 underscores at the front of the name.) I somehow missed this function when looking for it. But now I see that it's documented via help mata __lnmvnormalden()

Illustrative code:

Code:

cscript input float(R S) 9.825472 9.877803 9.955748 9.853316 8.841738 9.208739 10.170341 10.035743 end matrix Y = (10.170341 , 10.035743) // last row of X mat M = (9.8, 9.8) mat V = (.8, 0 \ 0, .7) local a = lnmvnormalden(M, V, Y) display "`a'" putmata X1 =(R S) mata: M1 = st_matrix("M") V1 = st_matrix("V") myeval = __lnmvnormalden(M1,V1,X1) myeval end getmata a = myeval list
1 like
Comment
John Mullahy

Join Date: Dec 2016

Posts: 736
#5

01 Oct 2019, 06:01

Thanks for sharing this, Stephen.

Yet another good example of an "undocumented" feature of Stata that should perhaps be documented. See

Code:

help undocumented

in which __lnmvnormalden() is listed.
Comment

Announcement

How to use lnmvnormalden(M,V,X) function (natural logarithm of the multivariate normal density)?

Comment

Comment

Comment

Comment