Wish: To facilitate future replication of Stata results, a StataCorp utility to help users "freeze" a collection of user-contributed ADO, Mata and MLIB programs for publication/posting with the Stata DO files that call those programs
Many of the more responsible journals require that referees and eventual readers be able to replicate the analytical results in submitted papers. Through its support for the Stata Journal and periodic Stata user conferences, StataCorp also encourages and helps users to produce and publish Stata programs that extend Stata's capabilities in small and large ways. But when a researcher later attempts to replicate the results in a published paper, the community-contributed programs originally used might not be available in the same version, or at all.
Thus a user wishing to enable future replication of a set of interlocking DO files and community-contributed ADO/Mata/MLIB files must figure out how to assemble and "freeze" the community-contributed files used in a given research project. This is doable and many users are already doing it, each in his or her own way. But it would be great if there were a set of StataCorp-supported conventions and utilities to standardize the process. (Ideally some journals, starting with the Stata Journal, would even require that Stata users conform to such StataCorp-recommended conventions and use the recommended program-freezing utilities.)
Diana Gold's SSC program -dependencies- seems to me to be an excellent model for a Stata-supported way to "freeze" (her word) a set of user-contributed Stata ADO/Mata/MLIB files in order to facilitate future replication of research results. I like the fact that it allows the future replicator to temporarily modify their -adopath- as they replicate, and then to undo this change and delete the replication-specific collection of ADO/Mata/MLIB programs at will. Other SSC programs that accomplish some of the same objectives include -zippkg-, -rqrs-, -which_version-, -copycode-, -adolist- and -usepackage-.
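To make the "temporarily modify the -adopath- and then undo it" idea concrete, here is a minimal sketch that uses only Stata's built-in -adopath- command. It is not the syntax of -dependencies- or of any of the other SSC programs listed above, and it assumes the frozen files have already been copied into a project subfolder. The folder name frozen_ado and the do-file name analysis.do are hypothetical.

```stata
* Sketch only: "frozen_ado" is a hypothetical subfolder of the current
* working directory that holds the frozen community-contributed files.
local frozen "`c(pwd)'/frozen_ado"

* Put the frozen folder at the front of the search path so that its
* versions are found before anything installed in PLUS or PERSONAL.
adopath ++ "`frozen'"

* Drop automatically loaded programs already in memory so that the
* frozen versions are the ones loaded from here on.
discard

* Rebuild the index of Mata libraries found along the ado-path.
mata: mata mlib index

* Show the search path to confirm the change.
adopath

* Run the replication code (hypothetical file name).
do analysis.do

* Undo the change once the replication run is finished.
adopath - "`frozen'"
```

Deleting the frozen_ado folder after the last line leaves the replicator's own installation untouched, which is the property I like in the approach described above.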
I take the point that results produced using the updated community-contributed ADO files may differ from those originally published precisely because the ADO file's bugs have been fixed. The new results might be "better". But I think this is an argument in favor of, rather than against, requiring authors to publish their frozen ADO files as part of a journal submission. I think that replicators need to start with a script that reproduces the published result as exactly as possible, before they experiment to discover the sensitivity of those results to different approaches and/or data. It is the replicator's responsibility to discover that the newer version of the community-contributed program produces a different result.
Diana Gold, Daniel Klein, Nick Cox, Sergio Correia and others have discussed these issues extensively in these threads:
https://www.statalist.org/forums/for...lable-from-ssc
https://www.statalist.org/forums/for...ge-require-ado
https://www.statalist.org/forums/for...o-local-folder
https://www.statalist.org/forums/for...os#post1523554
https://www.statalist.org/forums/for...79#post1662079