
  • Originally posted by daniel klein
    I am with Clyde here. What if NA represents ISO-3166 ALPHA-2 of Namibia? Do you think you would spot this using such an option? Unlikely. The same is true for NaN, by the way, and probably for many others, too.
    You can convert between alpha codes, numeric codes, and names for ISO 3166-1, ISO 4217, and ISO 639 using the pyconvertu command:
    https://www.statalist.org/forums/for...onvertu-in-ssc


    • The behavior of explicit subscripting combined with the if qualifier should be noted in Stata's help.

      Consider the following example:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(test condition)
      1 0
      2 0
      3 1
      4 1
      5 0
      end
      
      gen test2=test[1] if condition==1
      list test2

      The resulting test2 variable takes the first value of test (=1) in the data, not in the observations defined by the if condition (if that were the case it would be 3). I understand what happens behind the curtains, but the help entry for if qualifier is very plain and straightforward and doesn't mention such exceptions:
      "if at the end of a command means the command is to use only the data specified. if is allowed with most Stata commands.".
      In my example above the subscripting obviously goes beyond the scope defined by the if qualifier. Perhaps this is common knowledge among seasoned programmers, but in my opinion this behavior should be noted more visibly because it is counterintuitive, perhaps as a technical note in the "if" and/or "subscripting" help entries.


      • Originally posted by Evgeny Saburov
        [...] the help entry for if qualifier is very plain and straightforward and doesn't mention such exceptions:
        "if at the end of a command means the command is to use only the data specified. if is allowed with most Stata commands.".
        There is no exception here. The command -- generate -- applies only to the data specified. The help entry is naturally quiet about what the if qualifier does not do. It's impossible to give a complete list of that.

        Originally posted by Evgeny Saburov
        [...] in my opinion this behavior should be noted more visibly due to its counterintuitive nature, perhaps as a technical note in either "if" and/or "subscripting" help entries.
        There would certainly be no harm in the suggested addition.
        Last edited by daniel klein; 21 Nov 2024, 03:48.


        • Hi Daniel.
          When I read the phrase "the command is to use only the data specified", my understanding and expectation (and I believe many others will read it the same way) is that the command uses only those two observations where the condition is 1, to use my example. In other words, it's as if there's nothing else for the command to use besides those two observations: this is its scope, and the command shouldn't even "see" anything else.


          • I agree that there is some ambiguity here. [D] generate states that
            If you specify the if or in qualifier, the = exp is evaluated only for those observations that meet the specified condition or are in the specified range (or both, if both if and in are specified). The other observations of the new variable are set to missing
            This statement does not say the expression itself is evaluated subject to the restrictions implied by the specified condition or range. How could it be? An expression might be unrelated to the dataset. I do see the potential for misunderstandings.
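
            A minimal sketch of that point, reusing the example data from earlier in this thread (the variable name obsno is made up for illustration): the if qualifier only decides which observations receive a value, while _n inside the expression still counts observations in the full dataset.

            Code:
            clear
            input float(test condition)
            1 0
            2 0
            3 1
            4 1
            5 0
            end

            * if selects the observations that are filled in;
            * _n is still the observation number in the whole dataset
            gen obsno = _n if condition == 1
            list test condition obsno

            Here obsno is 3 and 4 in the two observations with condition == 1, not 1 and 2, and missing elsewhere. An explicit subscript such as test[1] is resolved against the full dataset in the same way.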

            If you want subscripts to refer to subsets of the dataset, you need by. This is documented in [U] 13.7.2 Subscripting within groups, albeit not as explicitly as you ask for.
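
            As a sketch of the by approach, again with the example data from this thread, and assuming that what is wanted is the first value of test among the observations with condition == 1 when ordered by test (the variable name test3 is made up for illustration):

            Code:
            clear
            input float(test condition)
            1 0
            2 0
            3 1
            4 1
            5 0
            end

            * within each by-group, [1] refers to the first observation of that group;
            * (test) sorts by test inside each group without defining new groups
            bysort condition (test): gen test3 = test[1] if condition == 1
            list test condition test3

            With these data test3 is 3 in both observations with condition == 1 and missing in the others. Note that bysort re-sorts the dataset, so the observations are listed in a different order than they were entered.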
