
  • Adornment in macro lists

    Dear All,

    just wanted to share my experience (and, perhaps, confirm that my conclusions are correct).

    Consider running the following code (example for this discussion, the production code is more complex, but that complexity is not relevant to the below):

    Code:
    do "http://www.radyakin.org/statalist/2024/2024-01-12_list_in.do"
    which is exactly:
    Code:
    local countries = `""Botswana" "Comoros" "Eswatini" "Ethiopia" "Kenya" "Lesotho" "Madagascar" "Malawi" "Mauritius" "Mozambique" "Namibia" "Rwanda" "South Africa" "South Sudan" "Sudan" "São Tomé and Principe" "Tanzania" "Uganda" "Zambia" "Zimbabwe""'
    
    foreach c in `countries' {
        local res : list c in countries
        display as text `"`c'"' _col(30) `"--->  "'  as result `"`res'"'
    }
    
    // END OF FILE
    I expected to get a positive confirmation for every element of the list over which I am iterating, but I got surprising results:

    Code:
    Botswana                     --->  1
    Comoros                      --->  1
    Eswatini                     --->  1
    Ethiopia                     --->  1
    Kenya                        --->  1
    Lesotho                      --->  1
    Madagascar                   --->  1
    Malawi                       --->  1
    Mauritius                    --->  1
    Mozambique                   --->  1
    Namibia                      --->  1
    Rwanda                       --->  1
    South Africa                 --->  0
    South Sudan                  --->  0
    Sudan                        --->  1
    São Tomé and Principe        --->  0
    Tanzania                     --->  1
    Uganda                       --->  1
    Zambia                       --->  1
    Zimbabwe                     --->  1
    Noticing that the problem occurs for every multi-word country name, I checked the documentation, but it confirmed that adorned values are acceptable as elements of a list and are treated as a single value by all list-manipulation routines:

    A list is a space-separated set of elements listed one after the other. The individual elements may be enclosed in quotes, and elements containing spaces obviously must be enclosed in quotes.
    It appears that in this case the straightforward way is not correct, since -foreach- removes the adornment present in the list, so the subsequent test sees the unadorned value (in local c). That value is itself interpreted as a list, and a multi-word item becomes a list of as many single-word items, resulting in an incorrect (unintended) evaluation. This was somewhat non-obvious and could easily have been overlooked (if not for the attention of my colleague, who was triple-checking the results).
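
    A minimal sketch of this effect, assuming a shortened two-country list (made up for illustration, not the production list). It uses the -word count- extended macro function to show how many elements the unadorned value left in local c contains:

    ```stata
    * Hypothetical short list; both elements are adorned with quotes
    local countries `""South Africa" "Sudan""'

    foreach c in `countries' {
        * -foreach- has stripped the quotes, so c holds bare text
        local n : word count `c'
        * South Africa is seen as 2 words, Sudan as 1
        display as text `"[`c']"' _col(30) as result `n'
    }
    ```

    Any value for which the count exceeds 1 is one that a later -list ... in- test would treat as several separate elements.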

    It is all clear now, and the following version of the code produces the correct results with an extra step added:

    Code:
    do "http://www.radyakin.org/statalist/2024/2024-01-12_list_in_works.do"
    which is exactly:
    Code:
    local countries = `""Botswana" "Comoros" "Eswatini" "Ethiopia" "Kenya" "Lesotho" "Madagascar" "Malawi" "Mauritius" "Mozambique" "Namibia" "Rwanda" "South Africa" "South Sudan" "Sudan" "São Tomé and Principe" "Tanzania" "Uganda" "Zambia" "Zimbabwe""'
    
    foreach c in `countries' {
        local cc `""`c'""'
        local res : list cc in countries
        display as text `"`c'"' _col(30) `"--->  "'  as result `"`res'"'
    }
    
    // END OF FILE
    I believe there is no way to avoid this extra step (the line defining local cc) in this case, e.g. by somehow reformulating the line that follows (through options, parentheses, or similar elements), but if there is one, please do let me know.


    Best regards and Happy New Year!

    Sergiy Radyakin



  • #2
    I can confirm your experience, and at least so far I have not thought of a simpler workaround.

    But I don't think that this behavior is unexpected. I think that notwithstanding your careful attention to the definition of a list, you need to focus on the explanation of how the macrolist -in- function works. From the help file:
    [quote]
    A in B returns 0 or 1; it returns 1 if all elements of A are found in B. If A is empty, in returns 1. Otherwise, 0 is returned.
    [/quote]
    So, if A is c and B is countries, what happens if "`c'" == "South Africa"? -in- does not purport to check whether "South Africa" is an element of `countries'. Rather, it purports to check whether "South" and "Africa", which are the elements of `c', appear as elements of `countries', and it correctly reports that they do not.
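
    A small sketch of that reading of -in-, assuming a shortened three-country list and hypothetical macro names (a, b, cc) chosen only for illustration. Each macro on the left of -in- is treated as a list in its own right:

    ```stata
    * Hypothetical short list for illustration
    local countries `""South Africa" "Sudan" "Kenya""'

    * Unadorned: a is the two-element list {South, Africa}
    local a South Africa
    local res_a : list a in countries
    display "`res_a'"    // 0, as in the output reported above

    * Two elements that are both present in countries
    local b Sudan Kenya
    local res_b : list b in countries
    display "`res_b'"    // 1 under the quoted definition

    * Adorned: cc is the one-element list {"South Africa"}
    local cc `""South Africa""'
    local res_cc : list cc in countries
    display "`res_cc'"   // 1, as in the corrected code above
    ```

    The second case shows why the result is a subset test and not an element test: a macro holding two valid country names also returns 1.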

    That said, the whole phenomenon of adornment in macros is complicated in Stata, and I know I get tripped up by things like this frequently.



    • #3
      Hello Clyde Schechter and thank you for commenting. I think the element of surprise came from a plain-English interpretation: if I am iterating over a list, then every current item should belong to that list, and this should be confirmed by asking `:list elementmacro in listmacro'. But of course Stata does not know that elementmacro holds a single element (because it was assigned by foreach); it takes it as a list, with the observed result.

      So in the end, Stata is working absolutely correctly. (But one only starts being careful after knowing there is a subtle problem with that straightforward approach.)

      If anything, I'd probably prefer that all elements of the list were enclosed in quotes, to make it explicit what is being processed, but it may cause compatibility and convenience issues throughout the package, so clearly won't be done for just this reason.

      Have a great weekend, Sergiy
