  • 'no, data in memory will be lost' warning after preserve

    Hello,
    I was trying to estimate abnormal stock returns using the calendar-time method. After we selected the portfolio, divided it into quintiles, and ran the regression, we were unable to restore our previously preserved data with the restore command because of this error.
    Do you know what this means and how to fix it?

  • #2
    Well, this error is arising somewhere inside two nested loops. We can only see the innermost one (-forvalues mofd = 13(1)144-) in what you show. It is possible that the error message you are getting is being thrown by a command that is not shown there but is higher up.

    As an aside, the -forvalues mofd = 13(1)144- loop is very odd and probably is not doing what you want. I say that because in order to access the value of mofd that is running from 13 through 144, there needs to be a reference to `mofd' (emphasis on those quote marks) within the loop. You have several references to unquoted mofd, which must refer to a variable in your data set (or you would have gotten a syntax error about that). But that is not the same thing. I suspect that in your -keep if- statement below line 17, one or more of those mofd references should be `mofd', though since I don't know what you're actually trying to do, I can't say. What I can say confidently is that the loop from lines 16 through 26 as written, even if you get the data to -restore-, would simply repeat the exact same calculations 132 times; it would not be doing different things each time through the loop.
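
    A minimal sketch of that distinction, using a made-up three-observation data set (the values are hypothetical, not taken from the original data). In -display-, the unquoted name is read as the variable, while the quoted form is the loop index:

    ```stata
    clear
    set obs 3
    gen mofd = 100 + _n        // hypothetical variable with values 101, 102, 103

    forvalues mofd = 13/15 {
        // Unquoted mofd is the variable (here, its value in observation 1);
        // the left-quote/right-quote form is the loop index.
        display "variable mofd = " mofd "   loop index = " `mofd'
    }
    ```

    The variable part of the output stays the same on every pass, while the loop index part changes.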

    Finally, your screenshot was just barely legible on my screen. Many screenshots are not legible at all. Even when they are, opening the screenshot obscures the post itself, so that trying to refer back and forth between the screenshot and the post is really tedious and burdensome. That's why the FAQ asks people not to use screenshots. The best way to show results is to copy them directly from the Results window or the log file and then paste them into a code block in your post.

    To give you better advice, I think we need to see more of what happened: show us the code and results from the top of the loop that ends on line 27.


    • #3
      The results are these:

      Code:
      *create portfolios
      . gen prtfcounter = 1
      
      . forvalues prtfcounter =1(1)4{
        2. if prtfcounter == 1{
        3. replace fut_ret = avg_ret_12
        4. }
        5. if prtfcounter == 2{
        6. replace fut_ret = avg_ret_24
        7. }
        8. if prtfcounter == 3{
        9. replace fut_ret = avg_ret_36
       10. }
       11. if prtfcounter == 4{
       12. replace fut_ret = avg_ret_48
       13. }
       14. replace prtfcounter = prtfcounter + 1
       15. preserve
       16. forvalues mofd = 13(1)144{
       17. 
      . keep if mofd - mofd_event < (12*prtfcounter) & mofd - mofd_event > 0
       18. 
      . xtile size_quant = size, nquantiles(5)
       19. xtile bm_quant = bm_fyrc, nquantiles(5)
       20. xtile momentum_quant = avg_prior_ret, nquantiles(5)
       21. 
      . gen size_count = 1
       22. gen value_count = 1
       23. gen momentum_count = 1
       24. 
      . statsby _a _se, by(size_quant bm_quant) saving("$dir"): regress fut_ret mktrf smb hml, rob
      > ust
       25. restore
       26. }
       27. }
      (272,219 real changes made)
      (272,219 real changes made)
      (213,070 observations deleted)
      no; data in memory would be lost
      r(4);
      
      end of do-file
      
      r(4);
      
      .


      • #4
        Clyde was right. You declare "preserve" in your outer loop, but try to "restore" in your inner loop. Check out the syntax highlighting in the editor, and/or indent for loops and ifs.
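
        A minimal sketch of keeping the pair at the same loop level, on hypothetical data (the variable x and the loop bounds are made up for illustration). Each -preserve- is matched by a -restore- within the same pass of the inner loop:

        ```stata
        clear
        set obs 10
        gen x = _n                      // hypothetical data: x = 1..10

        forvalues i = 1/2 {
            forvalues j = 1/3 {
                preserve                // snapshot taken inside the inner loop
                keep if x <= `j'        // destructive step
                display "i=`i' j=`j' observations kept: " _N
                restore                 // the original 10 observations are back
            }
        }
        ```

        With the code indented this way, a -preserve- and -restore- that sit at different depths are easy to spot.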
        Last edited by ben earnhart; 22 Feb 2016, 18:52.


        • #5
          Ben has pointed out the error you stumbled over. But there are others that you have not yet stumbled over, and because of them your results will be nothing like what you expect.

          In the outer loop, which begins above line 1, you refer to things like -if prtfcounter == 1...- The prtfcounter in your -if- statements is the variable prtfcounter, and specifically the value of that variable in the first observation. It does not refer to the loop counter, which would be `prtfcounter' (with the quotes* around it). Consequently prtfcounter will always be 1 because your "variable" prtfcounter was set to 1. Consequently nothing inside the outer loop ever changes, and the same exact calculations are done 4 times. Tangentially, the "variable" prtfcounter is not needed at all for this--with correct code for the loop that variable would play no role at all. In general, if you create a variable whose values are the same for every observation, you should ask yourself if you are not going down the garden path--there are few situations where a variable that doesn't vary is of any use in Stata.

          As noted earlier, the inner loop is similarly in error and runs the exact same calculations 132 times for each iteration of the outer loop. Moreover, the -statsby- command is saving to the same file every single time. Or, rather, it's trying to. After the very first time, it will refuse and will give you an error message telling you that the file already exists. You could quash that by specifying the -replace- suboption: but then you will end up with only the results from the very last iteration of the 4X132! So the loop structure itself isn't really compatible with using -statsby- in this way. I think that rather than looping over values of mofd, you need to have that variable as part of the -by()- in the -statsby- command.
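
          One way to read that suggestion, as a sketch on simulated data only: period, grp, x, and y are hypothetical stand-ins for the time variable, a quantile group, a regressor, and the outcome. A single -statsby- call with both grouping variables in -by()- yields one row of coefficients and standard errors per combination, with no loop and no repeated saving to the same file:

          ```stata
          clear
          set seed 12345
          set obs 400
          gen period = ceil(_n/100)        // hypothetical time variable: 4 periods
          gen grp    = mod(_n, 2) + 1      // hypothetical group: 2 levels
          gen x = rnormal()
          gen y = 1 + 2*x + rnormal()

          // -clear- lets statsby replace the data in memory with the results
          statsby _b _se, by(period grp) clear: regress y x
          list period grp _b_x _se_x
          ```

          The resulting data set has 4 x 2 = 8 observations, one per period-group cell.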

          I also note that your -statsby- command seeks to save _a from regression results, but there is no such thing. You probably meant _b.

          Right now I don't have time to rewrite all of this code, but try working on fixing these errors along the lines I have suggested, and if you get stuck, post back tomorrow showing what you tried and what Stata responded.


          *Don't forget that the two quotes are different. The first one is `, and is on the key to the left of the 1 key on a US keyboard. The second one is the ordinary single quote key, to the right of the semi-colon on the US keyboard. If you use 'mofd', you will get more error messages.


          • #6
            Having some unexpected time due to a cancelled meeting, I reviewed the code in #3 to try to understand what you are trying to do, and to fix it up. Much of the code and its complexity are superfluous. It seems that the essence of what you want to do is this:

            1. You want to consider four time windows: 1, 2, 3, and 4 years (with your data monthly) starting with the month after an "event."
            2. Within each time window you want to cross-classify the data into quintiles of bm_fyrc and size (variables already in your data).
            3. Within each of the 25 combinations of those quintiles, you want to regress average returns over the 1, 2, 3, or 4 year windows (average returns already having been calculated) against variables mktrf smb and hml (again, variables already in your data).
            4. You want to capture all of these regression coefficients and their standard errors in a destination file.

            If I have this right, I think the following code should do it:

            Code:
            local saving destination_file_name_here.dta  // INCLUDE THE .dta EXTENSION: -confirm file- DOES NOT ADD IT
            forvalues nyears = 1/4 {
                local nmonths = 12*`nyears'
                
                preserve
                // IDENTIFY OBSERVATIONS TO BE INCLUDED IN THIS ITERATION OF THE LOOP
                keep if inrange(mofd, mofd_event+1, mofd_event+`nmonths')
                
                // CALCULATE QUINTILES USED TO DEFINE BY GROUPS FOR -statsby-
                xtile bm_quant = bm_fyrc, nquantiles(5)
                xtile size_quant = size, nquantiles(5)
                
                //    DO THE 25 REGRESSIONS; STATSBY REPLACES DATA IN MEMORY
                statsby _b _se, by(size_quant bm_quant): ///
                    regress avg_ret_`nmonths' mktrf smb hml, robust
                
                // ADD A VARIABLE TO THE RESULTS IDENTIFYING WHICH ITERATION THESE
                // RESULTS CAME FROM
                gen nyears = `nyears'
                
                // IF DESTINATION FILE ALREADY EXISTS, APPEND RESULTS TO IT
                capture confirm file `"`saving'"'
                if c(rc) == 0 {
                    append using `"`saving'"'
                }
                save `"`saving'"', replace
                restore
            }
            Notes:

            1. I have not tested this, and I don't guarantee that it is free of typos.

            2. The loop index nyears carries out the functions that you were trying to get out of prtfcounter in your code.

            3. -statsby- is used without the -saving()- option. As a result it overwrites the data in memory (which would eventually be obliterated by -restore- anyway). We then add to it an indication of which value of `nyears' prevailed for the generation of these results, and then we append those results to the destination file first identified in local macro saving.

            4. Your original code created variables size_count, value_count, and momentum_count--but these variables were just set to 1, were never used in any calculations, and then were obliterated by the -restore-. Perhaps your code was a work in progress and you intended to add more calculations that made use of these variables, but I have simply omitted them altogether here.

            5. It is not safe programming style to use global macros such as "$dir" when a local macro will do. If there is any program in memory when you run your code that contains a global macro of the same name, then you have created a name clash and either yours will overwrite that one or that one will overwrite yours. The result can be baffling results that are extremely difficult to debug. Local macros, as used here, are much safer because they are accessible only within the same interactive session or do file (or, if you are running a do file in highlighted chunks, only within the same highlighted chunk) where they are defined. There is no possibility of interference by other programs even if they do have the same name. Global macros are never safe because you are not always aware of other programs that may be in memory while you are running your code. Consequently, they should only be used when some piece of information must be available at many levels of the program hierarchy and there is no feasible way to pass the information up and down the tree as arguments in command calls. And when they are used, it is wise to pick a name that is unlikely to have been picked by anybody else, so dir is a poor choice even when you need a global macro.
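
            A small sketch of the scoping difference described in note 5. The program name show_macros and the macro names mylocal and demo_global_xyz are hypothetical, chosen only for this illustration. The local macro defined by the caller is invisible inside the program, while the global is visible everywhere:

            ```stata
            capture program drop show_macros
            program define show_macros
                // Inside the program the caller's local is empty; the global is not.
                display "inside program: local = [`mylocal']  global = [$demo_global_xyz]"
            end

            local mylocal "value from caller"
            global demo_global_xyz "set by caller"
            show_macros

            macro drop demo_global_xyz       // clean up the global afterwards
            ```

            That visibility is exactly what makes a global with a common name like dir prone to clashes.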


