
  • OLS regression

    Hello everyone

    I am new to Stata and I have run into a simple problem. I ran a total of 672 regressions, using every possible combination of my variables, and now I want to see only the results whose p-values are smaller than 0.05. Is there any code that can do this check for me, or do I have to check each result manually?

  • #2
    Welcome to Statalist.

    Let's see your real code. Show me the code (using code delimiters) for the first 5 regressions you estimated. I don't know what
    I made a total of 672 regressions with my variables I used every possible combination
    means. Do you mean to tell me you literally estimated 672 regressions? What you seek is likely possible, but it seems like a bad idea from where I'm sitting, so could you please give me some context on what you're doing and why?



    • #3
      Ok so here is the code

      Code:
      regress    interestrate    inf_diff_quarterly    GDP1    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    GDP2    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    GDP3    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    GDP4    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    GDP5    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    GDP6    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    GDP7    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    gdp2015100Meur    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    gdp2015100index    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    gdpcurrent    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    gdpccurrent    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    manufacturing    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    construction    moneysupply    if    ID==5    
      regress    interestrate    inf_diff_quarterly    industry    moneysupply    if    ID==5
      It goes on like that. In short, the pattern is:
      Code:
      regress interestrate (one of 4 different variables) (one of 21 different variables) (1 optional variable) (1 optional variable) (cons or nocons)
      Since I am new, I wrote out all the possible combinations in a do-file and executed all of it, which gave me 672 regressions. The first two variable sets are required; the other two can be included or not, depending on the result. I am trying to find the best combination for a model so that I can understand which data explain the model better. I want to check the p-values of all the results and keep only those with p-value < 0.05 so I can study them. I hope I have explained my problem clearly and given you what you asked for.



      • #4
        You explained the issue well, and I could try to come up with a solution, but my real point is that this is extreme. What's the benefit of doing this at all? Why not run 20 or 40 regressions; is 672 really necessary? What are you getting from all of these and, more importantly, what will you do once you've estimated them all: put them in a table or a graph? Before I try to fix this, I'm genuinely curious about the thought process behind why this is desirable to begin with. You say that the goal is to understand the dataset better, but we have principal components analysis for this purpose.

        Oh, and to save yourself from going mad, this cleans up the code:
        Code:
        * GDP1 through GDP7
        forvalues i = 1/7 {
            regress interestrate inf_diff_quarterly GDP`i' moneysupply if ID==5
        }

        * remaining GDP and sector measures
        foreach v of varlist ///
                gdp2015100Meur gdp2015100index ///
                gdpcurrent gdpccurrent ///
                manufacturing construction ///
                industry {
            regress interestrate inf_diff_quarterly `v' moneysupply if ID==5
        }
        My honest advice, from one researcher to another, is to save yourself the labor. There's no need at all to estimate more than 100 models unless this is for an appendix or a Monte Carlo simulation.
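
        For what it's worth, the check asked about in #1 could be sketched roughly as below. This is only a sketch under assumptions: I am guessing that the p-value of interest is the one on the GDP-type regressor (the original question does not say which coefficient), the local macro name candidates is my own, and I have not run this on your data. It uses the coefficient and standard error that regress leaves behind (_b[] and _se[]) together with ttail() to get a two-sided p-value, and prints only the variables that fall below 0.05.
        Code:
        * Sketch only: assumes the p-value of interest is the one on the
        * GDP-type regressor; "candidates" is a made-up macro name.
        local candidates GDP1 GDP2 GDP3 GDP4 GDP5 GDP6 GDP7 ///
            gdp2015100Meur gdp2015100index gdpcurrent gdpccurrent ///
            manufacturing construction industry

        foreach v of local candidates {
            * fit the model without printing the full output table
            quietly regress interestrate inf_diff_quarterly `v' moneysupply if ID==5

            * two-sided p-value from the t statistic and residual df
            local p = 2*ttail(e(df_r), abs(_b[`v']/_se[`v']))

            * report only the regressors significant at the 5% level
            if `p' < 0.05 {
                display as text "`v'" _col(25) as result %6.4f `p'
            }
        }
        Run it from a do-file, since the /// line continuations do not work when typed interactively. If a regressor is dropped for collinearity, its p-value comes out missing and the line is simply not printed.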



        • #5
          Thank you so much for the help, and thanks for the advice. My aim was actually to understand which GDP or inflation measure is more closely related to the interest rate. After finding that, I was planning to continue, but there were too many options and I was stuck. I thought the best option would be to run all the possible regressions and look for the best fit, but now I see that was pointless. Thank you again for the help.
