
  • -domin- (SSC): computing time /max. number of independent variables

    Hi everybody,

    I am using Stata 16.0 and I am trying to perform dominance analysis using the -domin- command from SSC in combination with the -mixdom- module.

    My code is the following:
    Code:
    domin depvar indepvar1-indepvar21, reg(mixdom, id(orgunit)) fitstat(e(r2_w))
    Stata has to estimate a total of 2,097,151 regressions (2^21 - 1) and has been running for over 24 hours now, with no apparent progress so far. Does anybody have experience with how long this might take, or whether I should stop the calculation and adjust my code? Earlier, I tried to run the command with even more independent variables and got the error
    3900 unable to allocate real <tmp>[35,34359738368]
    I -compress-ed my data beforehand and have already reduced the number of variables by averaging related items. However, since the primary goal of my research is to analyze the relative importance of the various predictors that have been found to determine my dependent variable, I won't be able to reduce the number of included variables considerably.

    I am completely at a loss because I will have to present first results very soon and I am not experienced with Stata working memory issues.

    Any suggestions are very welcome!

    Thanks a lot in advance,
    Kai


  • #2
    Hi Kai,

    For 2,097,151 mixed-effects regressions, the next "dot" will not appear on the progress meter until 5% of them, or ~104,858 individual regressions, are complete. Assuming each takes around 30 seconds, the first dot will not appear for ~874 hours. So Stata is working, but it is doing so sequentially, model by model, which is not fast.
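As a rough check of the arithmetic above, here is a minimal Python sketch. The 30 seconds per model and the 5% step between dots are the figures assumed in this post, not measurements, and the function names are made up for illustration.

```python
def subset_count(n_predictors: int) -> int:
    """Number of non-empty predictor subsets, i.e. models to estimate."""
    return 2 ** n_predictors - 1


def hours_for(n_models: float, seconds_per_model: float = 30.0) -> float:
    """Sequential run time in hours (30 s per model is an assumption)."""
    return n_models * seconds_per_model / 3600.0


n_models = subset_count(21)                  # 21 independent variables
models_per_dot = 0.05 * n_models             # one dot per 5% of the models
first_dot_hours = hours_for(models_per_dot)  # time until the first dot
total_hours = hours_for(n_models)            # time for the full run

print(n_models)
print(round(models_per_dot))
print(round(first_dot_hours))
print(round(total_hours))
```

Under these assumptions the full run would take on the order of 17,000 hours, so waiting longer is unlikely to help. Each additional predictor roughly doubles the model count.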

    The best way forward for problems with many independent variables is to use the epsilon option, which does not require estimating all subsets; however, there is no mixdom wrapper for the epsilon-based version.

    There is no good way around the size problem without parallelizing the computations. I would like to extend domin's capabilities to parallelize them someday for users with multiple cores (or even a computer cluster).
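To give a sense of what parallelizing could buy, here is a hedged back-of-the-envelope sketch in Python. It assumes perfect linear scaling across workers and the same hypothetical 30 seconds per model. It does not describe an existing domin feature, and `ideal_parallel_hours` is an illustrative name only.

```python
import math


def ideal_parallel_hours(n_models: int, workers: int,
                         seconds_per_model: float = 30.0) -> float:
    """Best-case wall-clock hours if models are split evenly across workers.

    Assumes no scheduling or communication overhead, which is optimistic.
    """
    models_per_worker = math.ceil(n_models / workers)
    return models_per_worker * seconds_per_model / 3600.0


N_MODELS = 2 ** 21 - 1  # all non-empty subsets of 21 predictors

for workers in (1, 8, 64):
    print(workers, round(ideal_parallel_hours(N_MODELS, workers)))
```

Even in this best case, 64 workers would still need roughly 273 hours, so parallelization alone would not make a 21-predictor all-subsets run quick.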

    This response may be too late to correct course now, but I would ask that you reconsider how important incorporating the mixdom component is to the findings. A model like this could be easily estimated using linear regression with the epsilon option.

    - joe

    Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
    ----
    Research Fellow
    Fors Marsh

    ----
    Version 18.0 MP
