Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Find maximum values for a qfit

    Hi,

    I am doing a study about the ROA of real estate companies and their debt-to-equity ratio. I want to find a preferable debt ratio and have so far created a scatterplot and fitted a curve to it using:

    HTML Code:
    graph twoway (scatter ROA Kortfrist_sku) (qfit ROA Kortfrist_sku) if År == 2012
    I now want to find the interval for the debt ratio (x axis) where ROA is positive and also perhaps locate maximum values. How do I conduct this in Stata? Is it possible?

    Kind regards,
    Sophie Starck

  • #2
    With the quadratic equation \(y= ax^2 + bx + c\), \(\frac{-b}{2a}\) represents the maximum or minimum point. So one way would be to run a regression with a quadratic term on the RHS and determine this value.

    Code:
    webuse lbw, clear
    regress bwt c.age##c.age
    di -_b[age]/(2*_b[c.age#c.age])
    local min = `:di %3.2f `=-_b[age]/(2*_b[c.age#c.age])''
    set scheme s1mono
    tw qfit bwt age, xline(`min') xlab(10 20 `min' 30 40 50)
    Res.:

    Code:
    . regress bwt c.age##c.age
    
          Source |       SS           df       MS      Number of obs   =       189
    -------------+----------------------------------   F(2, 186)       =      3.89
           Model |   4007561.9         2  2003780.95   Prob > F        =    0.0222
        Residual |  95907736.7       186  515632.993   R-squared       =    0.0401
    -------------+----------------------------------   Adj R-squared   =    0.0298
           Total |  99915298.6       188  531464.354   Root MSE        =    718.08
    
    ------------------------------------------------------------------------------
             bwt | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |  -151.2227    66.3142    -2.28   0.024    -282.0474   -20.39809
                 |
     c.age#c.age |   3.253674   1.304625     2.49   0.014     .6799088    5.827439
                 |
           _cons |   4610.534   817.5263     5.64   0.000     2997.718     6223.35
    ------------------------------------------------------------------------------
    
    . 
    . di -_b[age]/(2*_b[c.age#c.age])
    23.238767
    Click image for larger version

Name:	Graph.png
Views:	1
Size:	27.9 KB
ID:	1731364

    Comment


    • #3
      SImilarly solving the quadratic equation for y = 0 will give two solutions which should bound the interval or intervals in which y is positive

      Comment


      • #4
        Thank you for your responses. I tried running the code that Andrew wrote.

        HTML Code:
        reg ROA c.Kortfrist_sku##c.Kortfrist_sku if År == 2012
        di -_b[Kortfrist_sku]/(2*_b[c.Kortfrist_sku#c.Kortfrist_sku])
        local min = `:di %3.2f `=-_b[Kortfrist_sku]/(2*_b[c.Kortfrist_sku#c.Kortfrist_sku])''
        set scheme s1mono
        tw qfit ROA Kortfrist_sku, xline(`min') xlab(1 1.5 `min')
        But I don't get the same line for the minimum value. I am also not sure how the results from this should be interpreted. I have want to know which debt-to-equity ratio (the variable Kortfrist_sku) is associated with the highest ROA.

        Graph.gph

        About what Nick wrote, how do I conduct this in Stata?

        Kind regards,
        Sophie Starck

        Comment


        • #5
          In one command you are selecting on year being 2012 and in the other you are using all years.

          Otherwise we can't use your data to check what you're doing, but the code looks good apart from the problem just mentioned.

          However, you jump between talking about the minimum and the maximum. Be clear: Andrew Musau's technique finds the position of the turning point of the quadratic, where the slope is zero; whether it's a minimum or a maximum depends on which way up the curve is. Indeed, it's possible that the turning point does not appear within the range of the data.

          Here is some technique. I use y = 4 - x^2 + noise, where we know that y should reach a maximum near x = 0 and be 0 near x = -2 and x = 2.

          Code:
          clear 
          set obs 101
          set seed 2803
          range x -5 5 
          gen y = 4 - x^2 + rnormal(0, 1)
          
          gen xsq = x^2 
          
          regress y x xsq 
          scalar c = _b[_cons]
          scalar b = _b[x]
          scalar a = _b[xsq]
          
          scalar xmax = -b/(2 * a)
          scalar xzero_1 = (-b - sqrt(b^2 - 4*a*c)) / (2 * a)
          scalar xzero_2 = (-b + sqrt(b^2 - 4*a*c)) / (2 * a)
          
          scalar list 
          
          scatter y x || qfit y x, xline(`=xzero_1' `=xmax' `=xzero_2')
          Code:
          . scalar list 
             xzero_2 = -1.9714539
             xzero_1 =   1.922494
                xmax = -.02447996
                   a =  -.9759464
                   b = -.04778225
                   c =  3.6989425
          Click image for larger version

Name:	quadratic.png
Views:	1
Size:	37.9 KB
ID:	1731452

          Comment


          • #6
            Thank you for your answer. I only want to run this for one of the years (År == 2012), but how do I add this condition to the command?

            Also, the curve I want to locate the interval and maximum points is one found by curve fitting using qfit:

            HTML Code:
            graph twoway (scatter ROA Kortfrist_sku) (qfit ROA Kortfrist_sku) if År == 2012
            Can I somehow save this curve so I can put in the scalar commands for max and zero? I understand the procedure but am not really sure how to implement it on my data.
            Lastly, just to be sure, is the result from xzero_2 and xzero_1 the intervall I am searching for?

            Kind regards,
            Sophie Starck

            Comment


            • #7
              You have already used an if qualifier in your first command in #4 -- which is the answer to your first question.

              Sorry, but I don't understand otherwise what new questions you're asking. The point of #5 was to show exactly how to do what you want, get a graph using qfit and annotate it with the positions of the maximum (or minimum if applicable) and the values where the outcome is predicted to be zero.

              As said, I don't have access to your data and don't see that it's my job to try to invent data like yours or to use your variable names.

              In (British) English at least "if you don't use it, you lose it" is proverbial for forgetting stuff you once knew. I guess I learned about solving quadratic equations around 1966 and haven't often needed to do it since leaving secondary school. Indeed I made a silly mistake in using fhe formula before I posted #5. https://www.mathsisfun.com/quadratic...on-solver.html was what I used to refresh my memory.

              Comment


              • #8
                Hi Nick. I think you misunderstood me. This is the qfit I get: Graph1.gph

                And I want to locate the x values for where y=0. (It is hard to tell what these values are from the picture). I've tried to google but could not find an answer anywhere.
                Hope you understand what I am trying to say.

                Comment


                • #9
                  If I misunderstood you, sorry, but how did that happen? Clearly in writing #7 I could not foresee the graph you include in #8.

                  Please don't show .gph attachments here, but ,png, as explained in FAQ Advice #12.

                  Here it is any way:

                  Click image for larger version

Name:	Graph1.png
Views:	1
Size:	20.6 KB
ID:	1731496


                  The same place https://www.statalist.org/forums/help#stata explains how to post a data example. To make real progress, we need to see the data behind the graph above. To my eye there are at least three problems:

                  1. The notable outlier, which requires some economic or financial judgement that I can't supply.

                  2. Given that outlier, the quadratic fit looks unconvincing. As you are concerned with whether fitted values or predictions are positive, zero or negative, the outlier means that negative predictions are given even in zones where most values are positive.

                  3. It may be, by logic similar to #5, that one solution for y = 0 is beyond the range of the data and indeed occurs for an x value that is impossible to achieve. Kortfrist_sku was explained as debt to equity to ratio -- can it be negative even in principle?

                  Comment


                  • #10
                    Thank you for your feedback Nick. I really appreciate your help. I have done some research about how to fit a better curve to the data and eventually landed on a lowess curve. This is the command I am typing in:

                    HTML Code:
                    graph twoway (scatter ROA Kortfrist_sku) (lowess ROA Kortfrist_sku) if År == 2012
                    Is it possible to find the interval of the x variable (Kortfrist_sku) where the y variable (ROA) is positive? (this would mean a preferable debt-to-equity ratio)

                    * Example generated by -dataex-. For more info, type help dataex
                    clear
                    input double(ROA Kortfrist_sku)
                    .03667054714784633 .05562242260413173
                    .0665068903535051 .041256526577077804
                    .051943536610805925 .03560253111744663
                    .09421061480692709 .0527377683125035
                    .10617209141274238 .09478878116343491
                    .10884728340675477 .0855726872246696
                    .0939744731374295 .07776788364499852
                    .10238083378990885 .0558848433530906
                    .07959761485993737 .06563424992492814
                    .15698232808298118 .06849468561915738
                    .037838816110036054 .0569392045217246
                    .05252166147318314 .6090615515928166
                    .03777970092172342 .5824354812499764
                    .07222790885710108 .5930496784760336
                    .08110266333580082 .5390062581883367
                    .055407447530335094 .5319456669398974
                    .0849719382669845 .26309760433338725
                    .021832712966519313 .2569522041063991
                    .051046812514702423 .34568336861914845
                    .024194400422609615 .2528262017960909
                    .26066098081023453 .2260127931769723
                    -.050008682062858134 .26358742837298144
                    .06568191743561504 .10427350551863374
                    .04858618576556071 .08555108115206163
                    .008556022385140078 .08171596727278913
                    .096873986815087 .06811304441802658
                    .11716682873877744 .10709197403830932
                    .07337231158674236 .04316573943761748
                    .04050815262193121 .04146692071653932
                    .031689851066482026 .06831838022124696
                    .009233464923952517 .03175993230764452
                    .19350701170735757 .012024048049176611
                    -.05904592460802847 .020905148448793508
                    -.20033818058843422 .0447074737910044
                    -.20822418036040327 .05822814955941891
                    -.2591788694934426 .0561890495551955
                    .09558923474242219 .14144960402910225
                    .11383397238068518 .02280869104398516
                    .059332020404209984 .027113754393085446
                    .2211184242332157 .11597038530868205
                    .152951463639933 .14109920152449454
                    .07168861893128183 .14487230628266787
                    .07307996572802726 .07095750467372738
                    -.05651202499979458 .20102099075809718
                    .0402422944791213 .05780652731407558
                    .004141959179050747 .054932194294559096
                    .005323323050323284 .06805568777385268
                    .019669311758494318 .07056727125908442
                    .008995875103979967 .04724335171384713
                    .005578068602050533 .05333229994759258
                    .06807955330396274 .01521299695709553
                    .10409099666034914 .016405744885371602
                    .08210258446093258 .01679099507016464
                    .02621247945186146 .025340265289189284
                    .06499807764394759 .030627446862685533
                    .053257042253521125 .8461707746478874
                    .023273618128923597 .8738324911958353
                    .04213187276174426 .8467102029351872
                    .01915318111283529 .14740264538535433
                    -.01900279998650609 .2813311743075937
                    .03040755653890729 .1388106330250434
                    -.001157006316024817 .18685931473925438
                    .02394788576286391 .18876583314695805
                    -.039912050803641234 .40542934685137716
                    .04566559982974265 .14144456813568201
                    -.02323043018545139 .35671659855298016
                    .04047214227094778 .14339004747338702
                    .04130777516464594 .09722589758593829
                    .022038830971867356 .14782959187856098
                    .08712813642510142 .09671583112135022
                    .07245750114861761 .07378178968135997
                    .059841451722283286 .07101933914832917
                    .07651568870767594 .01721769189859954
                    .05497238651051663 .021248678077631117
                    0 .025879270381940947
                    .06920880537901815 .019010776457584967
                    .06710695170427869 .016808037810260103
                    .0312278570705537 .15917602996254682
                    .01040199371546213 .0850037923935421
                    .025496753885500687 .095455439700964
                    .019868102512730614 .4149762083646381
                    .04207178226768352 .2687317020276864
                    .02999786383472184 .09243492325063322
                    .030029334568473058 .26938912047758734
                    .03800864840891451 .08349040913626787
                    .025661523702933887 .15649532336717564
                    .050767609300199364 .4024964762352224
                    .03259084261446732 .14004949938125774
                    .04021184242854413 .02885534110452895
                    .04478786765670506 .029438774171542517
                    .03179479101029196 .026281243436252887
                    .06754665666322798 .024805401856888305
                    .06348882050234315 .02497669607855656
                    .07019304281345566 .02374808868501529
                    .08088777946602996 .019546342522248752
                    .05707935545789766 .026903066121129464
                    .05108446450016376 .02323592561592489
                    .066587101423167 .01955165735903441
                    .00973677328897074 .09151454117542328
                    .22346537352964088 .01844278258160006
                    end

                    Comment


                    • #11
                      That is a different dataset from that discussed in #8 and #9.

                      I have several comments on various levels, but I fear that they will fall far short of a good answer to your underlying broad question, which I take to be: how to model the relationship between these variables.

                      You're relatively new to Statalist and may not have picked up that I am not an economist and that I never work with your kind of data in any serious sense, so I can have no idea of what should best be done here.

                      The context of what level you are at and who will be evaluating your work (teacher? supervisor? journal reviewers?) is also important.

                      1. What happened to the major outlier?

                      2. If we look at the data example, the good news is that a quadratic fit and a lowess fit match quite well.

                      3. But watch out: A quadratic fit is a quadratic fit and should be easy to replicate in any comparable software. However, a lowess fit is just a lowess ft with a particular bandwidth (and there's no magic about the default) and (even more) the lowess algorithm in Stata is a little idiosyncratic and not guaranteed to match how lowess is implemented anywhere else. My own view is that lowess is no longer state of the art and has been superseded by the loosely similar (but much more general) idea of local polynomial regression. But with local polynomials, there may be more consistency across software, but you still have to choose the kernel and the degree and the bandwidth.

                      4. In this particular case, both predictions for the response ROA are positive over the entire range of the data, so I don't know where that leaves your specific question.

                      Code:
                      scatter ROA K, ms(Oh) || qfit ROA K || lowess ROA K, yli(0, lc(red))

                      Click image for larger version

Name:	ROA.png
Views:	1
Size:	38.2 KB
ID:	1731619

                      Comment


                      • #12
                        Thank you for your feedback. In now feel more confident about the fit of lowess curve. Yes, you are correct about the underlying broad question. I have since posting last tried several different combinations in Stata and feel the most confident about this:

                        *year 2013 / Kortfristiga skulder
                        // Generate graph with scatter-plot and lowess-curve
                        HTML Code:
                        graph twoway (scatter ROA Kortfrist_sku) (lowess ROA Kortfrist_sku) if År == 2013
                        // Use predict to estimate values on Kortfrist_sku based on lowess
                        HTML Code:
                        lowess Kortfrist_sku ROA if År == 2013, gen(fitted_Kortfrist_sku_2013)
                        // Show values of Kortfrist_sku when ROA is positive
                        HTML Code:
                        list fitted_Kortfrist_sku_2013 if ROA > 0 & fitted_Kortfrist_sku_2013 != .
                        // Show values of Kortfrist_sku when ROA is negative
                        HTML Code:
                        list fitted_Kortfrist_sku_2013 if ROA < 0 & fitted_Kortfrist_sku_2013 != .
                        // (if you want to compare scatterplot with fitted values)
                        HTML Code:
                        scatter ROA Kortfrist_sku if År == 2013
                        scatter ROA fitted_Kortfrist_sku_2013
                        Does this seem like an appropriate method to you?

                        Comment


                        • #13
                          I am bowing out now as I don't think I can help further. I meant what I said about being unqualified to advise on what is appropriate. There were questions in #11 but they were questions for you to think about, not implied promises that I act as mentor indefinitely. The most important words were

                          I am not an economist and ... I never work with your kind of data in any serious sense, so I can have no idea of what should best be done here
                          Vincent Hagblom started https://www.statalist.org/forums/for...-of-debt-ratio and has said that you're working together. If that is so, please shut down one of these threads and add a cross-reference.

                          Comment


                          • #14
                            Thank you for your time. I appreciate your feedback and now know how to move forward.
                            Have a nice day!

                            Comment

                            Working...
                            X