  • "insufficient memory" error

    Hi guys,

    When I run an OLS regression I get the following error: "insufficient memory r(950)". The regression has 15,000,000 observations, and although I'm including only 2 independent variables, one of them has around 50,000 possible values. I have Stata/MP 16 (64-bit); my laptop has 16 GB of RAM, of which 3.81 GB of physical memory is available. Any idea how I can get around this problem?

    Thanks

  • #2
    Perhaps this question is related to the discussion on your previous topic

    https://www.statalist.org/forums/for...fferent-movies

    If you have 50,000 possible values of a categorical variable that you want to treat as a fixed effect, then perhaps some command other than regress will save you from having to estimate 50,000 fixed effects explicitly. This isn't a problem I have myself, but from previous topics on Statalist I gather that the areg command may help here.


    • #3
      Thanks William, useful answer. I replied more extensively in the topic that you quoted!


      • #4
        You should post, using appropriate formatting (the # sign from the formatting menu), exactly what you typed into Stata and exactly what Stata returned.

        Taken at face value, what you describe could not happen.


        • #5
          Hi, thanks for the answer. I don't understand what you mean by "what you are saying could not happen".

          The code I'm running is

          Code:
          reg score i.client_country i.gender i.item_booked, vce(cluster item_booked)
          and indeed when I run that, the output is "insufficient memory". If I instead run:

          Code:
          areg score i.client_country i.gender, absorb(item_booked)
          it does work. Just to give more information: score ranges from 1 to 10, item_booked covers more than 20,000 movies, and client_country is the country of the person rating the movie.


          • #6
            You do not have "2 independent variables" as you claimed in your original post. You have the number of levels in client_country minus 1, plus the number of levels in gender minus 1, plus the number of levels in item_booked minus 1. What you are using are factor variables, and the i. operator expands each categorical variable into a set of dummies.

            If you have a high-dimensional fixed effect, expanding the categorical variable through the i. operator results in too many regressors, and you can hit Stata's limits pretty fast.
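
            To see how many regressors the i. operators would actually create, something like the following should work. It is only a sketch: it assumes the variable names from the commands quoted in this thread, and the local macro names are mine. Note that it may take a while on 15 million observations.

            ```stata
            * Sketch: count the distinct levels of each categorical regressor.
            * Under the i. operator, each variable contributes (levels - 1) dummies.
            foreach v of varlist client_country gender item_booked {
                tempvar first
                egen byte `first' = tag(`v')   // 1 on one observation per level, 0 otherwise
                quietly count if `first'
                local levels = r(N)
                local dummies = `levels' - 1
                display "`v': `levels' levels -> `dummies' dummies under i.`v'"
                drop `first'
            }
            ```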

            The way to deal with this issue is the way you have already done it: absorb the highest-dimensional fixed effect using -areg-.

            So is there still a question here, or are you reporting on the great success?
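
            One thing to note: the -areg- call quoted in this thread drops the vce(cluster item_booked) option that the original -regress- call had. If the clustered standard errors are still wanted, a version along these lines might be what you are after. The second command is an optional alternative that assumes you are willing to install the community-contributed -reghdfe- package from SSC; it absorbs the country effects as well, so their coefficients are not reported.

            ```stata
            * Absorb the movie fixed effects and keep the clustering
            * from the original regress command.
            areg score i.client_country i.gender, absorb(item_booked) vce(cluster item_booked)

            * Optional alternative (community-contributed, not official Stata):
            * ssc install reghdfe
            * reghdfe score i.gender, absorb(item_booked client_country) vce(cluster item_booked)
            ```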

