
  • Understanding pointers on views of temporary variables

    Could someone help me make sense of how pointers behave when they point to views of temporary variables?

    Here is a toy example:
    Code:
    clear all
    sysuse auto
    
    program define testprog
        tempvar touse nonsense weight
        gen byte `touse' = 1
        gen byte `nonsense' = 0
        gen `weight' = weight
        mata: st_view(p = ., ., "`weight'", "`touse'")
        mata: pp = &p
        mata: mean(*pp)
        if `0' {
            drop `nonsense'
        }
    end
    
    testprog 0
    mata: mean(*pp)
    
    testprog 1
    mata: mean(*pp)
    p is a Mata view onto a temporary variable that holds a copy of the variable weight, and pp is a pointer to this view. The code above computes the mean of whatever pp points to four times. The first three calls correctly return the mean of weight. I have no idea what the last call returns.

    I am aware of the issue that Mata views are based on variable indices, not variable names. When temporary variables are deleted, the view supposedly switches to a different variable. What I do not understand is the following:
    1. Why does the second output still equal the mean of weight, even though it is computed after the temporary variables created in the program testprog are deleted when the program terminates?
    2. Why does it make a difference whether I explicitly delete a temporary variable at the end of a program or not?
    3. What is the pointer pointing to when called the last time? There is no variable in the data set with the mean that is returned here.

    Any insights would be appreciated.
    Last edited by Sebastian Kripfganz; 25 Jul 2024, 08:36.
    https://twitter.com/Kripfganz

  • #2
    I have no answer (after about 10 minutes), but I believe the pointers might unnecessarily complicate the investigation. The pointer points to the view every time. It's hard to see why the view changes in the last instance, and indeed what it is viewing:

    Code:
    clear all
    sysuse auto
    
    program define testprog
        tempvar touse nonsense weight
        gen byte `touse' = 1
        gen byte `nonsense' = 0
        gen `weight' = weight
        mata: st_view(p = ., ., "`weight'", "`touse'")
        *mata: pp = &p
        *mata: mean(*pp)
        */
        mata : p[1::10] // <- new
        if `0' {
            drop `nonsense'
        }
        mata : p[1::10] // <- new
    end
    
    testprog 0
    // mata: mean(*pp)
    
    testprog 1
    // mata: mean(*pp)
    yields

    Code:
    . testprog 0
               1
         +--------+
       1 |  2930  |
       2 |  3350  |
       3 |  2640  |
       4 |  3250  |
       5 |  4080  |
       6 |  3670  |
       7 |  2230  |
       8 |  3280  |
       9 |  3880  |
      10 |  3400  |
         +--------+
               1
         +--------+
       1 |  2930  |
       2 |  3350  |
       3 |  2640  |
       4 |  3250  |
       5 |  4080  |
       6 |  3670  |
       7 |  2230  |
       8 |  3280  |
       9 |  3880  |
      10 |  3400  |
         +--------+
    
    . // mata: mean(*pp)
    .
    . testprog 1
               1
         +--------+
       1 |  2930  |
       2 |  3350  |
       3 |  2640  |
       4 |  3250  |
       5 |  4080  |
       6 |  3670  |
       7 |  2230  |
       8 |  3280  |
       9 |  3880  |
      10 |  3400  |
         +--------+
                      1
         +---------------+
       1 |  3155.445313  |
       2 |  3157.085938  |
       3 |    3154.3125  |
       4 |  3156.695313  |
       5 |    3159.9375  |
       6 |  3158.335938  |
       7 |  3152.710938  |
       8 |    3156.8125  |
       9 |   3159.15625  |
      10 |   3157.28125  |
         +---------------+
    
    . // mata: mean(*pp)



    • #3
      Something is going on with the storage type of the temporary variables, but I can't figure out the details:

      Code:
      . program define testprog
        1.     tempvar touse nonsense weight
        2.     gen byte `touse' = 1
        3.     gen int `nonsense' = 0 // <- change byte to int
        4.     gen `weight' = weight
        5.     mata: st_view(p = ., ., "`weight'", "`touse'")
        6.     *mata: pp = &p
      .     *mata: mean(*pp)
      .     */
      .     mata : p[1::10] // <- new
        7.     if `0' {
        8.         drop `nonsense'
        9.     }
       10.     mata : p[1::10] // <- new
       11. end
      
      . 
      . testprog 0
      (output omitted)
      
      . testprog 1
                 1
           +--------+
         1 |  2930  |
         2 |  3350  |
         3 |  2640  |
         4 |  3250  |
         5 |  4080  |
         6 |  3670  |
         7 |  2230  |
         8 |  3280  |
         9 |  3880  |
        10 |  3400  |
           +--------+
                        1
           +---------------+
         1 |  2932.325928  |
         2 |  3348.332275  |
         3 |  2644.321533  |
         4 |  3252.330811  |
         5 |  4084.343506  |
         6 |  3668.337158  |
         7 |  2228.315186  |
         8 |  3284.331299  |
         9 |  3876.340332  |
        10 |  3396.333008  |
           +---------------+
      See how an int gets you closer than a byte? A float or double gets you exactly there (not shown here; try it for yourself).



      • #4
        Thank you, Daniel. You are right that this has nothing to do with pointers; that's somewhat reassuring at least. The observation that this behavior depends on the storage type is an interesting one.

        The other thing I still do not understand is why it makes a difference to explicitly delete a temporary variable at the end of a program.

        EDIT: If you change the order of the variables by, say, adding the option before(price) to the command that generates the temporary variable `weight', then the view switches correctly to the next variable (price) after the program terminates, as in my initial code. In your modified code, the view would remain unchanged (as long as its content is accessed within the program). The problem documented above only arises when the temporary variable is ordered last (which is the typical case). The variable index is then out of range once the temporary variable has ceased to exist.
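
        A minimal sketch of one possible workaround, assuming the data only need to be read after the program ends: st_data() returns a copy rather than a view, so the result should not depend on where the temporary variable sits in the data set. This is a different technique from the view-based code above, not a fix for it. The names testprog2 and w are made up for illustration, and the st_varindex() and st_nvar() calls are there only to compare the temporary variable's position with the number of variables.

        ```stata
        clear all
        sysuse auto

        program define testprog2
            tempvar touse weight
            gen byte `touse' = 1
            gen `weight' = weight
            // st_data() copies the values into w (hypothetical name),
            // so w holds no reference to a variable index
            mata: w = st_data(., "`weight'", "`touse'")
            // position of the temporary variable and number of variables right now
            mata: st_varindex("`weight'"), st_nvar()
        end

        testprog2
        // the temporary variables are gone here; w is an ordinary Mata matrix
        mata: mean(w)
        // number of variables after the program has terminated
        mata: st_nvar()
        ```

        The cost is memory: the copy duplicates the selected observations, which a view avoids.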
        Last edited by Sebastian Kripfganz; 26 Jul 2024, 03:50.
        https://twitter.com/Kripfganz



        • #5
          Originally posted by Sebastian Kripfganz
          The other thing I still do not understand is why it makes a difference to explicitly delete a temporary variable at the end of a program.

          EDIT: If you change the order of the variables by, say, adding the option before(price) to the command that generates the temporary variable `weight', then the view switches correctly to the next variable (price) after the program terminates, as in my initial code. In your modified code, the view would remain unchanged (as long as its content is accessed within the program).
          I also noticed that the behavior depends on the sort order. I guess Stata deletes the temporary variables in a specific order and manually dropping one of them (the middle one) somehow affects the view. Watch:
          Code:
          . clear all
          
          . sysuse auto
          (1978 automobile data)
          
          .
          . program define testprog
            1.     tempvar touse nonsense weight
            2.     gen byte `touse' = 1
            3.     gen byte `nonsense' = 0
            4.     gen `weight' = weight
            5.     mata: st_view(p = ., ., "`weight'", "`touse'")
            6.     /* omit original code */
          .     if `0' {
            7.         drop `weight'    // <- added
            8.         drop `nonsense'
            9.         drop `touse'     // <- added
           10.     }
           11. end
          
          .
          . testprog 0
          
          . // mata: mean(*pp)
          . mata : p[1::10]
                     1
               +--------+
             1 |  2930  |
             2 |  3350  |
             3 |  2640  |
             4 |  3250  |
             5 |  4080  |
             6 |  3670  |
             7 |  2230  |
             8 |  3280  |
             9 |  3880  |
            10 |  3400  |
               +--------+
          
          .
          . testprog 1
          
          . // mata: mean(*pp)
          . mata : p[1::10]
                     1
               +--------+
             1 |  2930  |
             2 |  3350  |
             3 |  2640  |
             4 |  3250  |
             5 |  4080  |
             6 |  3670  |
             7 |  2230  |
             8 |  3280  |
             9 |  3880  |
            10 |  3400  |
               +--------+
          
          .
          end of do-file
