
  • Need help with simple loop, Stata beginner

    Hello,

    I am using panel data with individuals that are identified by an identification number which is listed under the variable "Pidm", and the fiscal years in which they made donations. There are multiple rows per unique Pidm and per year they donated.
    • E.g., if an individual donated in fiscal years 2017 and 2018, they will have two rows, both with the same Pidm, one for each year's donation amount.
    I created a group variable which assigns a number to each unique Pidm.

    I am trying to have Stata go through all gifts for each unique Pidm, as defined by group numbers, and then copy the largest gift they ever made to a new variable called "Largest_Gift". There are over 400,000 unique Pidms.

    I normally use the levelsof command to run loops on each value of a variable, but when I try to do levelsof group I get this error message:

    [Image: levelsof.PNG, screenshot of the error message]

    Example of the dataset:
    dataex PPidm pidm_group FISCALYEAR FiscalYearGifts, count(10)

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long PPidm float(pidm_group FISCALYEAR) double FiscalYearGifts
    70004534 1    .    .
    70004536 2    .    .
    70004537 3    .    .
    70004538 4    .    .
    70004539 5 2002 1965
    70004539 5 2003  125
    70004542 6    .    .
    70004544 7    .    .
    70004546 8 2007   88
    70004546 8 2010   45
    end
    format %ty FISCALYEAR

  • #2
    The error message you are getting arises because apparently you have a very large number of values of pidm_group, so many that they cannot all be stored in a macro. (And, in fact, the -levelsof- command works just fine in this small example data set.)

    Fortunately, you do not need any loop at all to do this in Stata:

    Code:
    by PPidm, sort: egen largest_gift = max(FiscalYearGifts)
    The use of -by- instead of explicit looping is one of Stata's most important and distinctive features. The -egen- command is associated with a large range of functions, most of which can be used with -by:- prefixing in this way. Do read -help egen-. You will simplify many of your programs and save yourself a lot of trouble.

    Added: By the way, you are, in a sense, lucky that you encountered this error condition. Had the number of PPidm values been just small enough that they all fit in a local macro, your code would have run in a loop, but it would have taken an order of magnitude more time than doing it with -by:- and -egen-. -by:- is not only easier than looping, it is much faster.
    Last edited by Clyde Schechter; 10 Jul 2018, 14:41.
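
    To make the comparison in #2 concrete, here is a minimal sketch on a few rows of the example data from #1. It runs the -by:- and -egen- one-liner and then, for contrast only, the kind of -levelsof- loop the original poster had in mind. The variable name largest_loop and the local macro name groups are hypothetical, chosen only for this illustration; the loop is feasible here solely because the example has a handful of groups.

```stata
* Illustrative sketch only: a subset of the example data posted in #1.
clear
input long PPidm float(pidm_group FISCALYEAR) double FiscalYearGifts
70004538 4    .    .
70004539 5 2002 1965
70004539 5 2003  125
70004546 8 2007   88
70004546 8 2010   45
end

* Approach from #2: one statement, no loop.
* Donors with no gifts at all get a missing largest_gift.
by PPidm, sort: egen largest_gift = max(FiscalYearGifts)

* For contrast: an explicit loop over groups (largest_loop is a
* hypothetical name). With 400,000+ groups, -levelsof- cannot hold
* the list of values, and the loop would be far slower.
generate double largest_loop = .
levelsof pidm_group, local(groups)
foreach g of local groups {
    quietly summarize FiscalYearGifts if pidm_group == `g', meanonly
    quietly replace largest_loop = r(max) if pidm_group == `g'
}

* Both approaches should agree on every row.
assert largest_loop == largest_gift
list PPidm FISCALYEAR FiscalYearGifts largest_gift, sepby(PPidm)
```

    The -egen- version needs no group variable at all, because -by PPidm- works directly on the identifier.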

    • #3
      Thank you again Clyde for the speedy and helpful response! Worked like a charm.

      I also wanted to ask your recommendation on how to become a more skilled statistician and Stata user, having noticed your proficiency and familiarity from the depth of your posts. At the present time, my only training is through two very basic data science classes I took in college which only covered the surface of Stata. I would like to make more advanced models and predictions, rather than mostly basic summary statistics and charts. Do you recommend a masters program, self-teaching through literature, or possibly another path?

      Thank you for all your help.

      • #4
        I'm not sure what to recommend. Most of my statistical knowledge is actually self-taught. I was a mathematician originally. When I later switched careers and became an epidemiologist, I took introductory statistics in my epidemiology fellowship. But the rest I learned by reading books and journal articles, and picking the brains of my statistics PhD colleagues. But my mathematical background made that much simpler than it would be for others. I also had a strong background in computer programming dating back to 1962 that made it easier for me to learn to code in Stata (which bears a strong resemblance to C, though it is easy to get caught up on the differences at first). I think that a master's program in statistics at a university can be a very efficient way to become proficient in applied statistics. The tuition may be off-putting, or one might regard it as an investment. I would also note that many university statistics programs train their students in R. You might come to prefer R to Stata. If you migrate back to Stata, there will be another learning period for that, but see the next paragraph.

        As for becoming proficient in Stata, the very first step is to read the Getting Started [GS] and User's Guide [U] volumes of the PDF documentation that accompanies your Stata installation. (Select PDF Documentation from the Help drop-down menu in Stata; this will open your PDF reader to the overall table of contents. Click on the links from there.) Those are well written, include worked examples, and cover the basics. After that is done, there are many resources available. There are a number of books written for the express purpose of learning Stata coding, available from the Bookstore at www.stata.com. The nice thing about these books is that they are written with a focus on different disciplines. I'm particularly fond of Chris Baum's book, even though it draws a preponderance of its examples from economics/finance, which is not my area. But there are many available and you can probably find one that is at your level and focused on the kind of problems you will work on.

        That said, not everybody learns well by reading books. Another resource is the YouTube videos released by StataCorp. These tend to be focused on particular commands and, to my dismay, they often center around the GUI instead of coding and the command line. Nevertheless, they are helpful.

        One resource that I found extremely helpful when I was getting started in Stata is StataCorp's NetCourses. These are courses given online, run by StataCorp employees. They can take you directly from beginner to advanced over the course of a few months. They are reasonably priced as well.

        Hope this helps.

        • #5
          Perhaps one should mention to Michael that I suspect many of the gurus (a number come to mind: Nick, Clyde, Carlo, William, Phil, Richard, Steven, Joao, Attaullah, and several others including Stata's technical team) have spent many^2 hours writing and testing code, and providing guidance to those of us who benefit substantially. Incidentally, there is a young lady whose name I just cannot recall, who is also a shining light. And I take this opportunity to say thanks, ladies and gentlemen: I hope you are aware of how much help you are, and I am personally most grateful for your efforts and time.
          Best
          Laurence

          • #6
            I can think of two women who are members of Statalist and could be considered "shining lights." One, Carole J. Wilson, burned very brightly for a brief period of time, but has not been active here recently. The other, Romalpa Akzo, is a more recent member and contributes perhaps less often than the others mentioned in #5, but when she does, her contributions are top-flight and sometimes very creative. I would also add to the list of "gurus" Robert Picard, Joseph Coveney, Maarten Buis, and Weiwen Ng.

            The risk of going down this road, of course, is that there are probably others who deserve mention whose names have not yet popped into my or Laurence's head. So if others have nominations, do chime in.

            • #7
              For example, how could I forget Rich Goldstein?!?

              • #8
                Clyde sensei, please kindly receive my appreciation for your nice words. Honestly, I do feel slightly ashamed since I am far below such remarks, so I will take them as sweet encouragement for my learning process.

                Actually, none of my suggestions are anything creative; they are just "re-quotations" of wonderful lessons from brilliant masters that I have learnt somewhere within this Forum. So I would like to use this post to express my admiration and sincere thanks to those "gurus", especially my top favorites Nick Cox and you, Clyde Schechter, whom I have considered as "Sensei" with the full admiration attached to the meaning of this word. Among the many admirable names mentioned by Clyde, I would like to note that I am also a fan of Weiwen Ng, whose posts I find very informative and also ...very "musical".

                Learning and being able to have a small contribution through doing hobbies of "puzzle", in fact, is enjoyable. Then understanding that the suggestion is helpful to someone, is more than comfortable. And well, I suddenly notice that: being thanked is really likable, but saying thanks is even better. And I have just done it with all of my due respect and appreciation.

                • #9
                  Thanks on my behalf -- and if it's appropriate, on behalf of others mentioned -- for all kind and appreciative words, which in turn are much appreciated as reflections of gratitude and esteem.

                  Beyond that, this isn't a sport in which results allow a clear ranking of the most skilled. Many people contribute in different ways, including (and they know who they are) those who take particular trouble to welcome newcomers in a warm and friendly way (something I don't deprecate, I just don't do it!) and those who make a point of trying to advise those who aren't (yet) on top of the small skills of how to ask good questions.

                  • #10
                    A bit late to the party. I am deeply honored to be mentioned in the same posts as some of the people I've learned a lot from.

                    And I would like to move to add Richard Williams' name to the list.
                    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

                    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

                    • #11
                      I'm pretty sure that Richard Williams is the Richard referred to in #5.
