  • #16
    Richard Rovin, there is a link to some power functions in R from Blume on this page: https://www.statisticalevidence.com/...ation-p-values

    There was an American Statistician article discussing going beyond p < 0.05 (https://doi.org/10.1080/00031305.2019.1583913), and of the many suggestions, I thought Blume's construction was useful and practical, so I am looking at this more.


    • #17
      Thank you, much appreciated


      • #18
        I looked at the R-code and think that it is rather straightforward to translate it into Stata code. Would there be any interest in having second-generation p-values available in Stata? Or is somebody already writing the necessary code? I am asking because the translation looks to me like a good coding exercise, so I would try to do something within the next few weeks.


        • #19
          That would be amazing. Far beyond my skill set. Many thanks. Richard.


          • #20
            Sven-Kristjan Bormann That would be of interest to me too. You might contact Blume before you begin, because I know that besides R he has used Stata.

            Here are references:

            •Blume JD, D'Agostino McGowan L, Dupont WD, Greevy RA, Jr. (2018) Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299
            •Jeffrey D. Blume, Robert A. Greevy, Valerie F. Welty, Jeffrey R. Smith & William D. Dupont (2019) An Introduction to Second-Generation p-Values, The American Statistician, 73:sup1, 157-167, DOI: 10.1080/00031305.2018.1537893
            •Stewart TG and Blume JD (2019) Second-Generation P-Values, Shrinkage, and Regularized Models. Front. Ecol. Evol. 7:486. DOI: 10.3389/fevo.2019.00486

            A couple of applications using second-generation p-values:

            •Chaganti S, Welty VF, Taylor W, Albert K, Failla MD, Cascio C, et al. (2019) Discovering novel disease comorbidities using electronic medical records. PLoS ONE 14(11): e0225495. https://doi.org/10.1371/journal.pone.0225495
            •PhD Thesis: Sequential Rematched Randomization and Adaptive Monitoring with the Second-Generation p-Value to increase the efficiency and efficacy of Randomized Clinical Trials. Chipman. 2019. Vanderbilt Biostatistics.


            • #21
              Dave Airey: If you know Blume, then you could ask him whether he has Stata code available. At the moment, I don't want to contact him. I have only superficially started translating the R-code, and I want to treat this translation process as a coding exercise. The calculations themselves are simple, but translating some of the concepts available in R into Stata code is something I will have to figure out, at least if the Stata code is supposed to have the same capabilities as the R-code.


              • #22
                An update regarding making SGPVs available in Stata:
                I have started translating the R-code into Stata code. I will first make the code available on my Github page for those who are interested in testing it. The code is not uploaded yet.

                What has been done:
                • At the moment it is possible to calculate the Second Generation P-Values themselves and the power (function) of the test.
                • The command for calculating the SGPVs accepts "vectors" as input for the interval estimates.
                • A new wrapper command calculates the SGPVs after estimation commands. The SGPVs are displayed next to the "1st Generation P-Values". The wrapper works as a prefix command, but can also be used as a normal command. Prefixing is needed if the SGPVs should be displayed directly after the estimation. Stored estimates also work.
                What is left to do (To-Do List):
                • One-sided intervals/tests for SGPV calculation have not been tested yet. The same goes for the code for dealing with -Infinity and +Infinity.
                • The command for the power function does not understand "vectors" as input yet.
                • The commands for the false discovery risk and the plotting are not yet translated, because they are the most complicated ones for me to translate.
                • The documentation/help files are only partially translated into Stata. For now, the help-files need to be generated with the help of the user-provided -makehlp- command by running the command for the respective ado-file.
                • Clean up the code and properly comment it.
                What I could do:
                • Upload the current version of the commands for volunteers to test
                • Also upload the leukemia dataset used to demonstrate the plotting of the SGPVs
                • Make the dependency on the user-provided -integrate- command more explicit. This command is used for the numerical integration required by bonus statistics about the power function. It will also be used for false discovery rate calculations.
                What you could do:
                • Give hints, suggestions, or comments about what I should add, change, etc. with regard to the commands.
                • Volunteer to test the code and maybe improve it yourself ;-)
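
For readers who have not looked at the papers yet: the SGPV itself is a short calculation. Below is a minimal sketch in Python. It is not the package's code and not Blume's R code; it only assumes the definition given in Blume et al. (2018), listed in #20, for a finite interval estimate and an interval null of positive length. The function name `sgpv` and its argument names are made up for illustration, and the infinite (one-sided) cases are deliberately left out.

```python
def sgpv(est_lo, est_hi, null_lo, null_hi):
    """Second-generation p-value for a finite interval estimate [est_lo, est_hi]
    and a finite interval null hypothesis [null_lo, null_hi]."""
    est_len = est_hi - est_lo
    null_len = null_hi - null_lo
    if est_len <= 0 or null_len <= 0:
        raise ValueError("this sketch needs intervals of positive, finite length")
    # Length of the overlap between the interval estimate and the null interval
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))
    # Small-sample correction: when the estimate is more than twice as wide
    # as the null interval, the SGPV is capped at 1/2 (inconclusive data)
    correction = max(est_len / (2.0 * null_len), 1.0)
    return overlap / est_len * correction


if __name__ == "__main__":
    # Interval estimate entirely outside the null interval
    print(sgpv(0.5, 1.5, -0.2, 0.2))
    # Interval estimate entirely inside the null interval
    print(sgpv(-0.1, 0.1, -0.2, 0.2))
    # Very wide interval estimate that covers the whole null interval
    print(sgpv(-1.0, 1.0, -0.2, 0.2))
```

An SGPV of 0 means the interval estimate and the null interval do not overlap, an SGPV of 1 means the interval estimate lies entirely inside the null interval, and values near 1/2 indicate inconclusive data.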


                • #23
                  I will look at it and try it. Thanks for posting this.


                  • #24
                    Ok. I will write an update once I have uploaded the files to my Github page. First I want to clean up the code, generate the help-files and improve the documentation.
                    Otherwise, it might happen that only I understand easily how to use these commands.
                    I have also not yet tested the codes under different scenarios, so they might still give the wrong answers.


                    • #25
                      The next update:
                      The first version of the commands can be downloaded from my Github page via
                      Code:
                      net install sgpv, from(https://raw.githubusercontent.com/skbormann/stata-tools/master/) replace
                      What has been done since the last update:
                      • Translated the command fdrisk to calculate the False Discovery/Confirmation Risks (FDR/FCR).
                      • Added more options to the sgpv wrapper command:
                        • Also displays the FDR/FCR
                        • Allows calculating the SGPVs only for specific coefficients
                        • Accepts a properly formatted matrix as input
                        • The default values used for the calculation of the SGPV and FDR/FCR can be changed
                        • nomata option to not use a user-provided command for integration. Support for this command might be removed in later versions if the numerical integration command integ from Stata is deemed fast enough. I could also simply make the "nomata" option the default.
                        • nobonus option to control which statistics besides SGPVs are displayed
                      • Initial help files for the commands exist now. They contain some examples, but lack some of the explanations from the R-code.
                      • The commands should give the same results as the R-codes. At least all examples from the R-code provided the same results with my translated Stata commands.
                      What is left to do (To-Do List):
                      • One-sided intervals/tests for SGPV calculation have not been tested yet. The same goes for the code for dealing with -Infinity and +Infinity.
                      • The command for the power function does not understand "vectors" as input yet.
                      • The command for plotting (plotsgpv) is not yet translated. Unless it is requested, I will postpone the translation. If someone is well versed in both Stata and R plotting commands, then this translation should be easy/easier.
                      • The documentation/help files are mostly translated into Stata, but some longer examples and descriptions are hard to translate correctly due to the limitations of Stata's help file viewer. Overall the format of the help files could be improved, but for now they are generated with the user-provided makehlp command. Fine-tuning of the help files will happen before the release to the SSC.
                      • Clean up the code and properly comment it. The code contains some comments, but they might still need improvement. At some later point, I envision a rewrite of the Stata code into Mata code to make the code look more like the original R-code. For now, I have used some "work-arounds" to circumvent the limitations of the Stata ado-language.
                      • Extend the main sgpv command to support more commands, for example ttest. It might also be nice to make the calculated results as easily exportable as those of other estimation commands. At the moment, for example, an export to the user-provided esttab command is not possible. If this is a desired feature, then I will think about implementing it. For me personally, it is not yet important for my own research.
                      • Use more sensible version numbers: Currently the version number is stated as 0.9 in all commands to indicate that they are not yet released to the SSC and might still contain some bugs, but that the majority of the expected functions work.
                      What I could do:
                      • Also upload the leukemia dataset used to demonstrate the plotting of the SGPVs. Let me know if you need this example dataset to test the commands. Otherwise I will only upload it after I have translated the plotting command.
                      • Make the dependency on the user-provided -integrate- command more explicit. This command is used for the numerical integration required by bonus statistics about the power function and FDR/FCR calculations. For now the command is installed quietly in the background if it is not already installed.
                      What you could do:
                      • Give hints, suggestions, or comments about what I should add, change, etc. with regard to the commands.
                      • Volunteer to test the code and maybe improve it yourself ;-)
                      • It would be even better if you gave me feedback within the next few weeks, so that I could release the package to the SSC with some more explanation of how SGPVs are calculated and what they can be used for.
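
To make the false discovery risk more concrete, here is a rough Python sketch. It is not the fdrisk translation and makes simplifying assumptions that the real commands do not: the estimate is normally distributed with a known standard error, the interval estimate is a (1 - alpha) confidence interval, and the null and alternative are each represented by a single point value, which avoids the numerical integration mentioned above. The function and argument names are hypothetical. The first function is the probability that the confidence interval misses the null interval entirely (SGPV = 0); the second applies Bayes' rule to it.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def prob_sgpv_zero(theta, null_lo, null_hi, se, alpha=0.05):
    """P(SGPV = 0) when the true effect is theta, i.e. the probability that
    the (1 - alpha) confidence interval lies entirely outside [null_lo, null_hi].
    Assumes a normal estimator with known standard error se."""
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # CI entirely below the null interval: estimate + z*se < null_lo
    below = _STD_NORMAL.cdf((null_lo - theta) / se - z)
    # CI entirely above the null interval: estimate - z*se > null_hi
    above = _STD_NORMAL.cdf((theta - null_hi) / se - z)
    return below + above


def false_discovery_risk(theta0, theta1, null_lo, null_hi, se,
                         prior_null=0.5, alpha=0.05):
    """P(H0 | SGPV = 0) by Bayes' rule, with point values theta0 (null)
    and theta1 (alternative) and prior probability prior_null for H0."""
    p0 = prob_sgpv_zero(theta0, null_lo, null_hi, se, alpha)
    p1 = prob_sgpv_zero(theta1, null_lo, null_hi, se, alpha)
    prior_odds_alt = (1 - prior_null) / prior_null
    return 1.0 / (1.0 + (p1 / p0) * prior_odds_alt)


if __name__ == "__main__":
    # Interval null [-0.5, 0.5], standard error 1, alternative at theta = 3
    print(prob_sgpv_zero(0.0, -0.5, 0.5, 1.0))
    print(false_discovery_risk(0.0, 3.0, -0.5, 0.5, 1.0))
```

With a point null (null_lo equal to null_hi) the first function reduces to the usual two-sided rejection probability, so at the null value it returns alpha; widening the null interval lowers that probability.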


                      • #26
                        I work in behavioral medicine. My experience is that reviewers and editors have been trained to be very rigid with respect to p-values: if it's .049 it's significant, if it's .051 it's not. I'm sure most of the folks on this forum would agree that such a discrete dichotomous criterion is not meaningful, but that's often my experience when working with co-authors, editors, and reviewers. I was recently working with folks on an intervention development (not testing) proposal. It's basically a 2 x 2 x 2 x 2 design to evaluate 4 components that could potentially be included in an "optimized" intervention package that would be tested in a future RCT. Folks who advocate this type of developmental research recommend using a .10 level for evaluating treatment components, since the goal is to develop, not definitively test, a specific intervention. Not using a traditional .05 level of significance was an issue for some reviewers. I'm heading toward the 18th tee of my career and was certainly trained in frequentist statistics. But the rigidity with which p-values are evaluated, and the almost insistent use of .05, has increasingly bothered me. Aren't there contexts in which Type II errors may be more egregious than Type I errors? I've become increasingly interested in these issues. I've read the ASA statement, and there is a special issue of (I think) The American Statistician with around 40 articles talking about ways to move forward. I know I don't have the answers, but I sure have a lot of questions. I have a simple mind, but I have found these introductory articles an enlightening discussion of these issues from a Bayesian point of view. I'd be interested in hearing other thoughts on this stuff.

                        Jarosz, A.F. & Wiley, J. 2014. "What are the odds? A practical guide to computing and reporting Bayes factors." Special issue of Journal of Problem Solving (7): 2-9.

                        Wagenmakers, E.J. 2007. "A practical solution to the pervasive problems of p values." Psychonomic Bulletin & Review 14(5): 779-804.


                        • #27
                          One last update in this thread, I promise ;-)

                          The R-code is now more or less completely translated into Stata.
                          The last remaining function, the plotting function 'plotsgpv', is now also available. It can reproduce the example from the R-code with minor differences.

                          To get the newest version, just run
                          Code:
                          net install sgpv, from(https://raw.githubusercontent.com/sk...-tools/master/) replace
                          I removed all external dependencies.
                          'sgpvalue' can now calculate the SGPVs for matrices larger than the matsize limit via a Mata function. However, this function is relatively slow for the example leukemia dataset.
                          An alternative approach using variables is also implemented and is very quick (less than a second for around 7000 observations, compared to around 30 seconds in the Mata version).
                          The example leukemia dataset from the R-package is also online now.
                          All commands received some minor changes, bugfixes, etc.

                          I plan to upload the commands to the SSC after the documentation is complete. I want to add a bit more explanation of what SGPVs are than the documentation in the R-code provides.
                          After that, I want to rewrite the commands and integrate them into the 'sgpv' wrapper command. To make the usage of this command even easier, I want to add a dialog box.
                          The goal is to make the commands look and behave more like typical Stata commands instead of emulating the functionality of the R-functions.


                          • #28
                            Sven-Kristjan Bormann The sgpv package is amazing. Thanks


                            • #29
                              Richard Rovin I am glad to hear that you find the package useful.

                              I have released an update on my Github page.
                              If you have already installed the package, then just run
                              Code:
                              adoupdate sgpv, update
                              to get the newest version.
                              Otherwise, just run the following commands if you have not installed the package before
                              Code:
                              net from "https://raw.githubusercontent.com/skbormann/stata-tools/master/"
                              net install sgpv.pkg, replace
                              net get sgpv.pkg, replace // to get the leukemia example dataset
                              This is an intermediate release and hopefully the second-to-last release on Github before I submit to SSC.
                              • Updated the documentation so that it should not be necessary to read the articles about SGPVs to understand the commands and the output of the commands.
                              • There are some changes to the display of the results from the sgpv-command and bugfixes in all commands.
                              • The sgpvalue-command can now handle one-sided intervals similar to the original R-code. Positive and negative infinity are specified by the missing value in the respective options. However, this feature has not yet been tested much and exists mostly to provide nearly the same functionality as the R-code.
                              • The plotsgpv-command now has the option twoway_opt(string asis) to make additional changes to the plot. The previous handling of additional graphing options was too error-prone.
                              • The example leukemia dataset now contains a better description of its origin and content.
                              I hope that with this update using the package becomes easier than before.
                              Plans/Ideas for the next release:
                              • Add the possibility to work with matrices larger than c(matsize) as inputs for the sgpv-command.
                              • Add subcommands to the sgpv-command, so that the other commands (sgpvalue, fdrisk, sgpower and plotsgpv) can be called/used from sgpv. The goal is to make the calculations of SGPVs easier by not having to remember/learn four different commands, but only one.
                              • Further bugfixes, code clean up and updated documentation.
                              In case you are interested in what other ideas I have to improve the package, you can find them at the top of the sgpv.ado file.

                              Let me know if you have any more comments, find new or old bugs, suggestions about the help files, what features to add, etc.
