
  • Originally posted by daniel klein
    I am with Clyde here. What if NA represents ISO-3166 ALPHA-2 of Namibia? Do you think you would spot this using such an option? Unlikely. The same is true for NaN, by the way, and probably for many others, too.
    You can convert between alpha codes, numeric codes, and names for ISO 3166-1, ISO 4217, and ISO 639 using the pyconvertu command:
    https://www.statalist.org/forums/for...onvertu-in-ssc



    • The behavior of explicit subscripting combined with the if qualifier should be noted in Stata's help.

      Consider the following example:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(test condition)
      1 0
      2 0
      3 1
      4 1
      5 0
      end
      
      gen test2=test[1] if condition==1
      list test2

      The resulting test2 variable takes the first value of test (=1) in the data, not in the observations defined by the if condition (if that were the case it would be 3). I understand what happens behind the curtains, but the help entry for if qualifier is very plain and straightforward and doesn't mention such exceptions:
      "if at the end of a command means the command is to use only the data specified. if is allowed with most Stata commands.".
      In my example above the subscripting obviously goes beyond the scope defined by the if qualifier. Perhaps this is common knowledge among seasoned programmers, but in my opinion this behavior should be noted more visibly because it is counterintuitive, perhaps as a technical note in the "if" and/or "subscripting" help entries.



      • Originally posted by Evgeny Saburov
        [...] the help entry for if qualifier is very plain and straightforward and doesn't mention such exceptions:
        "if at the end of a command means the command is to use only the data specified. if is allowed with most Stata commands.".
        There is no exception here. The command -- generate -- applies only to the data specified. The help entry is naturally quiet about what the if qualifier does not do. It's impossible to give a complete list of that.

        Originally posted by Evgeny Saburov
        [...] in my opinion this behavior should be noted more visibly due to its counterintuitive nature, perhaps as a technical note in either "if" and/or "subscripting" help entries.
        There would certainly be no harm in the suggested addition.
        Last edited by daniel klein; 21 Nov 2024, 03:48.



        • Hi Daniel.
          When I read the phrase "the command is to use only the data specified", my understanding and expectation (and I believe many others will understand it similarly) is that the command uses only those two observations where the condition is 1, to use my example. In other words, it's as if there's nothing else for the command to use besides those two observations: this is its scope, and the command shouldn't even "see" anything else.



          • I agree that there is some ambiguity here. [D] generate states that
            If you specify the if or in qualifier, the = exp is evaluated only for those observations that meet the specified condition or are in the specified range (or both, if both if and in are specified). The other observations of the new variable are set to missing
            This statement does not say the expression itself is evaluated subject to the restrictions implied by the specified condition or range. How could it be? An expression might be unrelated to the dataset. I do see the potential for misunderstandings.

            If you want subscripts to refer to subsets of the dataset, you need by. This is documented in [U] 13.7.2 Subscripting within groups, albeit not as explicitly as you ask for.
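            A minimal sketch of the by approach, using the example data posted earlier in the thread. The sort variable test in parentheses is an assumption added here so that the order within each group is well defined; note also that bysort reorders the dataset.

            ```stata
            clear
            input float(test condition)
            1 0
            2 0
            3 1
            4 1
            5 0
            end

            * Under by, test[1] is the first observation of each condition group,
            * with observations ordered by test within the group
            bysort condition (test): gen test2 = test[1] if condition == 1
            list test condition test2
            ```

            With these data, test2 is 3 in the two observations where condition is 1 and missing elsewhere.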



            • For graph bar: I wish that I could add a pattern to a bar's fill to help distinguish bars, to help people who print out in black and white graphs that were designed to be seen in color. It looks like currently the only ways to differentiate bars are by color or by changing the thickness/pattern of the surrounding lines.

              I also wish the documentation provided an easy way to preview the different bstyle options for the most common schemes.



              • Stata's -twoway- graphs easily handle multi-line titles, axis titles, etc. e.g.
                Code:
                scatter price mpg, t1("This is the t1 Title" "Shown in" "Three Lines")
                Code:
                scatter price mpg, xti("This is the x-axis Title" "Shown in Two Lines")
                It would sometimes be handy if axis labels could be shown in multiple lines but current capabilities don't allow this (unless I've missed something).

                For instance, I might want to show something like
                Code:
                scatter price mpg, xlab(10 21.3 "Mean=" "21.3" 40)
                but that does not work.

                Curiously, that xlab specification produces gibberish without generating an error, whereas this one does generate an error:
                Code:
                . scatter price mpg, xlab(10 21.3 "L1" "L2" 40)
                invalid label specifier, :  10 21.3 "L1" "L2" 40:
                r(198);



                • Originally posted by John Mullahy
                  For instance, I might want to show something like
                  Code:
                  scatter price mpg, xlab(10 21.3 "Mean=" "21.3" 40)
                  but that does not work.
                  The label needs to be surrounded by compound quotes:

                  Code:
                  scatter price mpg, xlab(10 21.3 `""Mean=" "21.3""' 40)
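                  If the labelled value should be computed rather than typed in, the same compound-quote construction can be combined with local macros. This is only a sketch: it assumes the auto dataset shipped with Stata (the thread does not name the dataset), and the macro names m and mtxt are hypothetical.

                  ```stata
                  sysuse auto, clear
                  summarize mpg, meanonly
                  local m = r(mean)                     // tick position
                  local mtxt : display %4.1f r(mean)    // formatted text for the label

                  * The whole two-line label is wrapped in compound double quotes
                  scatter price mpg, xlabel(10 `m' `""Mean=" "`mtxt'""' 40)
                  ```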
                  ---------------------------------
                  Maarten L. Buis
                  University of Konstanz
                  Department of history and sociology
                  box 40
                  78457 Konstanz
                  Germany
                  http://www.maartenbuis.nl
                  ---------------------------------



                  • Thanks Maarten.

                    Maybe this is obvious to users other than myself, but if not then perhaps in v19 this could be documented in
                    Code:
                    help axis_label_options
                    Last edited by John Mullahy; 09 Dec 2024, 10:32.



                    • The ability to use the mlmv method for Full Information Maximum Likelihood (FIML) estimation in GSEM would be nice.



                      • #381

                        For graph bar: I wish that I could add a pattern to a bar's fill to help distinguish bars, to help people who print out in black and white graphs that were designed to be seen in color. It looks like currently the only ways to differentiate bars are by color or by changing the thickness/pattern of the surrounding lines.
                        I can't speak for StataCorp but this is a longstanding if intermittent request -- which I don't support myself, but that's not what this thread is for.

                        Often (although naturally not always) a good answer is to use something else instead. The forum includes many examples where bar charts would be better replaced by dot charts or line charts.



                        • #379

                          When I read the phrase "the command is to use only the data specified" my understanding and expectation (and I believe many others will understand it similarly) is for the command to use only those two observations where the condition is 1, to use my example. In other words, it's as if there's nothing else for the command to use besides those two observations, this is its scope and the command shouldn't even "see" anything else.
                          I don't recall seeing that interpretation before.

                          StataCorp tend not to document misinterpretations. After all, the main point of documentation is to say what you can and should do, not to widen the issue to misinterpretations. Where would that stop?

                          As one of many examples, people sometimes try to use egen functions outside egen. Should the help for egen document that you shouldn't try that?

                          Sometimes Stata Tips in the Stata Journal document common misunderstandings as pitfalls. As the Editor in charge of that series, I see that as an important role for Tips. The Speaking Stata columns and some other papers sometimes do that too.

                          If you can find posts on Statalist where people have misunderstood how subscripting works in that context, that would strengthen the case for a Tip.

                          I don't think subscripting is often misunderstood.
                          Last edited by Nick Cox; 11 Dec 2024, 05:11.



                          • Hi @Nick Cox.
                            I don't recall seeing that interpretation before.
                            Me neither, but I think it's a reasonable reading of what's written in the help for -if-.
                            Does the command "use only the data specified" when both the -if- qualifier and subscripting are used? Clearly not: a subscript can refer to data outside the scope defined by the -if- qualifier, and the command will use them. To me that contradicts the very clear and strong statement in the help for the qualifier.

                            Of course I realize that what seems reasonable in Stata for people with little experience and those with decades of it is very different, but given that both -if- and subscripting are such basic and essential tools I thought that clarifying this behavior somewhere would help the former. Otherwise people might learn it the hard way by being bitten by it, like I did.

                            If you can find posts on Statalist where people have misunderstood how subscripting works in that context, that would strengthen the case for a Tip.
                            Well, in this very thread you can count me, daniel klein (to some degree, I guess) and someone who liked my post among those who feel that this quirk deserves some attention.
                            I don't know if it's the right way to go about it, or whether something like that is even allowed here, but a fun way to do it would be a quiz of sorts using my example case (or something similar), with a question along the lines of: "After reading the documentation for the if qualifier and subscripting, given this data and after running that command, what values will the new variable take and why?"



                            • #379 #381 #388

                              Approaching this from a different direction, there is certainly evidence that some users want that interpretation, but do realise that it is not what Stata offers.

                              A command listsome was posted on Statalist on 10 April 2008 in

                              https://www.stata.com/statalist/arch.../msg00448.html

                              in response to a question from Malcolm Wardlaw in

                              https://www.stata.com/statalist/arch.../msg00438.html on the same day.

                              But that command was never documented or made public beyond Statalist.

                              Independently Robert Picard posted a listsome command on SSC that was first announced on 18 August 2014 in

                              https://www.statalist.org/forums/for...f-observations

                              The question arose in similar but not identical form in 2022, leading to another command listfirst on SSC:

                              https://www.statalist.org/forums/for...et-a-condition

                              https://www.statalist.org/forums/for...6832-listfirst

                              #379 stands as a request to StataCorp. I am not at present minded to write further on this myself, but that's all it says and nothing else.



                              • Markov Switching ARDL
