  • Winsorize

    This is my first time using Stata, so I'm definitely struggling. How do I winsorize z-scores? The assignment says:

    Winsorize Z score at the 1 and 99th percentiles (use “findit winsor” to download the package, p(.01) will winsorize at the 1st and 99th percentiles)

    Additional deliverables:
    5. Find the mean, median, min and max winsorized Z-Score for the whole sample.
    6. Find the mean, median, min, and max winsorized Z-Score by SIC industry.
    7. Find the mean, median, min, and max winsorized Z-Score by year.

  • #2
    See this thread here:

    https://www.statalist.org/forums/for...-leve-1-and-99


    • #3
      I looked at this thread, but I'm not sure what variable I would do it by. For example, right now I have: winsor zscore, gen(zscore1) h(1)

      Ignoring that and following the other thread, I tried this: winsor zscore, cuts(1 99) by(sic)

      It tells me I am still missing a generate option, and winsor2 is not a recognized command.


      • #4
        I do not have either of these installed, but from reading the help file, you need -winsor2-. According to the help file, -winsor- is not byable.

        So you need something like

        Code:
        winsor2 zscore, cuts(1 99) by(sic)
        or by any other variable such as year that you want to winsorise.


        • #5
          winsor zscore, gen(newz) p(.01)

          is what I currently have, since my professor indicated that I needed to include p(.01) in the assignment. But now I'm struggling with how to get the statistics by variable. Any insight on that? For example, deliverables 5-7: once I know how to do 6, I can do 5 and 7.


          • #6
            "Now I'm struggling with how to get the statistics by variable. Any insight on that? For example, deliverables 5-7: once I know how to do 6, I can do 5 and 7."
            Check out this handy command called -tabstat-

            Code:
            help tabstat
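
            To illustrate what the help file describes, here is a minimal sketch using the auto dataset that ships with Stata. The variables price, foreign, and rep78 belong to that example dataset and are stand-ins only; in this thread the analogous call would use the winsorized variable (newz) with by(sic) or by(year).

            ```stata
            * Load the example data; note that clear discards whatever is in memory
            sysuse auto, clear

            * Mean, median, min, and max for the whole sample
            tabstat price, statistics(mean median min max)

            * The same statistics within each level of a grouping variable
            tabstat price, statistics(mean median min max) by(foreign)
            tabstat price, statistics(mean median min max) by(rep78)
            ```

            The by() option is what turns the single summary row into one row per group.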


            • #7
              I've tried a MILLION different ways of doing this, including tab, summarize, etc., and nothing is giving me what I need D:


              • #8
                Also, sysuse auto won't work for me. Every time I run it, it says: no; data in memory would be lost
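
                That message appears because Stata refuses to overwrite unsaved data in memory unless told to. A minimal sketch of one way around it, assuming the current work should be kept; the file name mywork is hypothetical:

                ```stata
                * Save the dataset currently in memory first
                * ("mywork" is a hypothetical file name)
                save mywork, replace

                * The clear option tells Stata it may discard the data in memory
                sysuse auto, clear

                * Experiment with the example data
                summarize price

                * Go back to the saved work
                use mywork, clear
                ```

                If the data in memory are not needed, sysuse auto, clear on its own is enough.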


                • #9
                  You got confused advice from whoever is responsible for the 1 2 3 you don't show and the 4 5 6 you do show.

                  winsor is from SSC. I wrote it back in the day. It doesn't support groupwise calculations, so you could only use it to get separate Winsorized results for subsets of observations by writing a loop or using it repeatedly, neither of which you may want to get into right now.

                  winsor2 is also from SSC but a different command that you must install separately.

                  An important detail, as flagged earlier in the thread cited above, is that the latter command does support groupwise calculations. Also, the two commands don't have identical syntax, so the syntax of one won't necessarily work with the other.

                  I have to say that the mention of tabstat at best points to a way to tabulate the percentiles, which are worth knowing, but that command is not a way to Winsorize.
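
                  For readers who want to see what a groupwise calculation involves, here is a rough sketch that uses neither winsor nor winsor2, only the built-in egen function pctile(). It assumes zscore and sic already exist in memory, as in this thread; the names z_p1, z_p99, and zscore_byw are made up for illustration. Results may differ slightly from what either SSC command reports, since percentile rules vary.

                  ```stata
                  * Group-specific 1st and 99th percentiles of zscore
                  bysort sic: egen z_p1  = pctile(zscore), p(1)
                  bysort sic: egen z_p99 = pctile(zscore), p(99)

                  * Start from the raw values, then pull each tail in to its cut point
                  gen zscore_byw = zscore
                  replace zscore_byw = z_p1  if zscore < z_p1
                  * Missing values sort above every number, so guard against them here
                  replace zscore_byw = z_p99 if zscore > z_p99 & !missing(zscore)

                  * The helper variables are no longer needed
                  drop z_p1 z_p99
                  ```

                  Values between the two cut points are left untouched; only the tails within each industry are replaced.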


                  • #10
                    This one seemed to work for me: winsor zscore, gen(newz) p(.01)

                    Is that not correct?


                    • #11
                      Here are the entire directions:

                      The model is calculated as follows:

                      $Z\text{-Score} = .012X_1 + .014X_2 + .033X_3 + .006X_4 + .999X_5$

                      Where:
                      $X_1$ = Working capital (WCAP) / Total assets (AT)
                      $X_2$ = Retained earnings (RE) / Total assets (AT)
                      $X_3$ = Earnings before interest and taxes (EBIT) / Total assets (AT)
                      $X_4$ = Market value of equity (MV) / Book value of total debt (LT)
                      $X_5$ = Sales (SALE) / Total assets (AT)

                      Deliverable
                      1. Download and open Bankrupt.dta from Canvas.
                      2. Calculate the Altman Z-Score for every observation using the equation and variables listed above. Note that I’ve put the names of the variable in the dataset next to the full description above.
                      3. Drop all firms with the SIC Codes of 6000-6999 and 4800-4999.
                      4. Winsorize Z score at the 1 and 99th percentiles (use “findit winsor” to download the package, p(.01) will winsorize at the 1st and 99th percentiles)
                      5. Find the mean, median, min and max winsorized Z-Score for the whole sample.
                      6. Find the mean, median, min, and max winsorized Z-Score by SIC industry.
                      7. Find the mean, median, min, and max winsorized Z-Score by year.
                      8. Using the mean winsorized z score by year, graph this using a customized line graph (at least three options written in the Stata code, not through a graph editor). With so many options to choose from, your graph shouldn’t be like any others turned in.
                      9. Create a word document with your name at the top. Put the descriptive statistics from steps 5, 6, and 7 in the Word document. Skip a line and then copy your Stata code into the Word document. Skip a line and then copy your graph into the Word document. Save the document and upload it through Canvas.

                      Here is the code I have thus far:

                      cd "C:\Users\katri\OneDrive\Documents\Stata"
                      use bankrupt-1, clear

                      * Altman Z-score components
                      gen x1 = wcap/at
                      gen x2 = re/at
                      gen x3 = ebit/at
                      gen x4 = mv/lt
                      gen x5 = sale/at
                      gen zscore = (.012*x1)+(.014*x2)+(.033*x3)+(.006*x4)+(.999*x5)

                      * Drop SIC codes 6000-6999 and 4800-4999
                      drop if inrange(sic, 6000, 6999) | inrange(sic, 4800, 4999)

                      findit winsor
                      winsor zscore, gen(newz) p(.01)

                      bysort year: sum newz


                      • #12
                        Katrina, you are asking for advice, then we tell you what to do, but you are not doing it, and then you keep on asking for advice... This is a non-convergent process, evidenced by the fact that you are asking for something trivial, and yet despite the question being trivial we are on post #12 in this thread.

                        Again: there are two different user-contributed commands that deal with winsorising. -winsor- by Nick Cox is not byable; that is, it cannot be executed on subsets of the data defined by, e.g., by(sic). This would not be a problem for somebody who knows how to write loops in Stata, but you clearly do not know how to do that. Therefore you need to abandon the -winsor- command.

                        Install the other user-written command, -winsor2-, and do what you need to do with it.

                        In particular

                        Code:
                        findit winsor2
                        from within Stata and follow the instructions to install.

                        Then -winsor2- accepts syntax such as

                        Code:
                         
                        winsor2 zscore, cuts(1 99) by(sic)
                        which calculates a variable winsorised at the 1st and 99th percentile, where the operation is performed by sic industry.

                        And this answers your question.
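
                        Put together, the steps above might look like the sketch below. It assumes zscore and sic are already in memory, and it uses ssc install rather than findit because the thread notes that winsor2 is distributed on SSC. The name of the new variable (zscore_w) is an assumption about the command's default suffix, so check the help file or the describe output.

                        ```stata
                        * Install once from SSC
                        ssc install winsor2

                        * Winsorize within each SIC industry at the 1st and 99th percentiles
                        winsor2 zscore, cuts(1 99) by(sic)

                        * Assumption: the result is stored as zscore_w; list matching variables to confirm
                        describe zscore*
                        ```

                        Replacing by(sic) with by(year) applies the same cut points within each year instead.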



                        • #13
                          We have to add that, as the format seemed to imply, this is clearly an assignment, and Statalist policy is not to support assignments. Please see FAQ Advice Extras.

                          That said, your teacher(s) seem to have assumed that winsor would suffice, which isn't the case.


                          • #14
                            Cross-posted at https://www.reddit.com/r/stata/comme...rize_function/
