
  • When writing a program, does the 'seed' option always require a default?

    I am writing a program that allows the user to specify a seed for reproducibility. This is an extract of a much larger program I am writing. For illustration purposes, let us say I would like to split the data into n (e.g., 10) equal-size folds. The program would be:

    Code:
    sysuse auto, clear
    
    capture program drop myprogram
    program define myprogram
        syntax [, folds(int 10) seed(int 1)]

        set seed `seed'         // seed to reproduce the splitting of the data

        capture drop foldid
        qui xtile foldid = runiform(), nq(`folds')
    end
    
    myprogram, seed(1)
    Is it possible to offer a seed option without defining a default value? In other words, I would like the program to split the data at random when no seed has been specified, while still allowing the user to set a seed for reproducibility.

    Please note that splitting folds is not the focus of this topic. I am sure there are ways to set a seed for splitting data but this is not of interest for this topic.

  • #2
    I am not sure that I understand the question or why you're mentioning splitting at all.

    You can specify that seed() takes a numlist and then do what you like depending on whether that option was specified.

    If the user didn't specify seed(), the local macro seed is an empty string after syntax:

    Code:
    if "`seed'" != "" set seed `seed' 
    else set seed 1
    If that doesn't help and you don't get a better answer, you may need to expand on your question.
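For the program in the original question, the numlist idea might be sketched as follows. This is only an illustration under some assumptions: the numlist qualifiers (integer, max=1, >=0) are my own choice of restrictions, runiform() is used for the random draws, and the else branch is dropped so that nothing is set when seed() is omitted. The variable fold_a exists only for the check at the end.

```stata
sysuse auto, clear

capture program drop myprogram
program define myprogram
    // seed() has no default: the local macro `seed' is empty when the option is omitted.
    // The qualifiers (at most one nonnegative integer) are an assumption, not a requirement.
    syntax [, folds(integer 10) seed(numlist integer max=1 >=0)]

    // Change the seed only when the user asked for it explicitly
    if "`seed'" != "" set seed `seed'

    capture drop foldid
    quietly xtile foldid = runiform(), nq(`folds')
end

// Same seed twice: the folds should be identical
myprogram, seed(1)
generate byte fold_a = foldid
myprogram, seed(1)
assert foldid == fold_a

// No seed: the draws continue from the current random-number state
myprogram
```

Without seed(), the split depends on whatever state the random-number generator is in when the program is called, so repeated calls in one session will generally give different folds.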


    • #3
      It is also good practice that the program you write not overwrite (silently or otherwise) any seed the user may have set previously, because this makes troubleshooting reproducibility issues a pain. The -seed()- option should only change the seed if explicitly requested by the user.


      • #4
        Hi Nick,

        Many thanks for your quick response. Yes, this has helped enormously.

        The reason for mentioning splitting was just as an illustration. It also helps to see whether any solution delivers the desired effect. I can see that your solution does indeed create the same folds when the seed is specified, but also, crucially, allows the observations to be in different folds each time the program is run when a seed is not specified. (Just a quick confirmation it is working).

        Leonardo, that is a good point. I will make sure that it does not overwrite any seed the user has specified previously. Many thanks again!
