  • Cannot run the r_ml_stata code

    Hello everyone,

    I have been trying to run a machine learning model with the r_ml_stata command in Stata 18. Please see the following code:

    clear all
    cd "F:\SUST_Project\Mosk 2023\Data\Output"
    global learner "nearestneighbor"
    sysuse "MLDATA.dta", clear
    global y " DEPRES"
    global x " SSS L_STRESS W_STRESS AGE"
    splitsample , generate (vsplit, replace) split (0.80 0.20)show rseed (1010)
    ssc install r_ml_stata
    search r_ml_stata

    preserve
    keep if vsplit==1
    drop vsplit
    save data_train, replace
    restore

    preserve
    keep if vsplit==2
    drop $y
    drop vsplit
    save data_test, replace
    restore

    preserve
    keep if vsplit==2
    keep $y
    gen index=_n-1
    save test_y, replace
    restore

    use data_train, clear

    r_ml_stata $y $x , mlmodel ($learner) inprediction ("in_pred") ///
    cross_validatin ("CV") out_sample ("data_test") ///
    outprediction ("out_pred") seed(10) save_graph_cv ("graph_cv")

    All of the code above runs fine. However, the last command, r_ml_stata, does not run in Stata 19: Stata reports that r_ml_stata is an unrecognized command. I am not sure what mistake I have made. Any suggestions would help me a lot. Thank you.

  • #2
    The command relies on Stata's Python integration. Do you have Python set up?



    • #3
      I have a similar problem, even though I have Python and all recommended packages installed. Using the Boston example, Stata 17 MP returns an error after I run this command: r_ml_stata_cv $y $X , mlmodel("tree") data_test("boston_test") default prediction("pred") seed(10).

      The error message is:
      File "<stdin>", line 1
      ################################################################################
      SyntaxError: invalid syntax
      (1041 lines skipped)
      (error occurred while loading r_ml_stata_cv.ado)
      r(7102);



      • #4
        I emailed Giovanni Cerulli, and he sent back a response that worked. To share and help others, here is his suggestion:

        I suggest using Anaconda to install Python.

        To set this up, follow these steps:

        * Python installation in Stata

        * Look at how many installations of Python you have
        . python search

        * Look at which one is the current one
        . python query

        * Change the python executable
        . python set exec "/Users/XXXXX/opt/anaconda3/bin/python" , permanently
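
Once the executable has been set, it can help to confirm which interpreter is actually in use and whether it can see the packages the command needs. The following is a minimal sketch, not part of r_ml_stata: `check_modules` is a hypothetical helper name, pandas is the package mentioned in this thread, and the `sklearn` entry is an assumption about what the command requires. Run it with the interpreter that `python query` reports.

```python
import importlib.util
import sys


def check_modules(names):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Show which Python executable is running this code.
print("Interpreter:", sys.executable)

# "pandas" comes from the thread; "sklearn" is an assumed requirement.
for name, found in check_modules(["pandas", "sklearn"]).items():
    print(name, "found" if found else "missing")
```

If a package shows as missing here but was installed elsewhere, it was probably installed into a different Python installation than the one Stata is pointed at.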



        • #5
          Hi all, I am encountering this exact problem, and I'm not sure how to apply Giovanni Cerulli's suggestion. Please forgive me; I'm brand new to Python. I installed Anaconda, and from what I understand, Python is installed automatically along with it.

          I'm getting the following output from Giovanni's suggestion:
          Code:
          . * Python installation in Stata
          . 
          . * Look at how many installation of python you have
          .   python search
          ------------------------------------------------------------------------------------------------------------------------
           Python environments found:  
           /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/Current/bin/python3
           /usr/bin/python3
           /usr/local/bin/python3
          ------------------------------------------------------------------------------------------------------------------------
          
          . 
          . * Look at which one is the current one
          .   python query
          ------------------------------------------------------------------------------------------------------------------------
              Python Settings
                set python_exec      /usr/local/bin/python3
                set python_userpath  
          
              Python system information
                initialized          no
                version              3.12.2
                architecture         64-bit
                library path         /Library/Frameworks/Python.framework/Versions/3.12/lib/libpython3.12.dylib
          
          . 
          . * Change the python executable
          .   python set exec "/usr/local/bin/python3" , permanently
          (python_exec preference recorded)
          I'm still getting the same error, however:
          File "<stdin>", line 1
          ################################################################################
          SyntaxError: invalid syntax
          (1041 lines skipped)

          when I run r_ml_stata_cv.

          In case it's helpful, here's the toy example.
          Code:
          * Load initial dataset
            sysuse auto, clear
          
          * Form the train and test datasets
            get_train_test , dataname("auto") split(0.80 0.20) split_var(svar) rseed(101)
          
          * Form the target and the features
            global y "price"
            global X "mpg rep78 headroom trunk weight length foreign"
          
          * Run tree regression in default mode
            use auto_train, clear
            r_ml_stata_cv $y $X  , mlmodel("tree") data_test("auto_test")  default prediction("pred") seed(10)
          I'm guessing I'm using python set exec incorrectly. Any help?



          • #6
            I’ve encountered the same issue. Could you please let me know if the problem has been resolved? Thank you!



            • #7
              When I run the r_ml_stata command, Stata attempts to download the Python package pandas ('Trying to install pandas automatically with pip...') but fails to do so. I seem to have installed the package successfully directly in Python. The advice on the Stata forum recommends installing Python with Anaconda. I'm not sure what that means.

              Here is more of the error message if that helps:

              Trying to install pandas automatically with pip...
              File "<stdin>", line 1
              install_mod("C:\Users\osell1\AppData\Local\Programs\Python\Python312\python.exe","pandas
              > ")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: trunc
              > ated \UXXXXXXXX escape
              r(7102);
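
The 'unicodeescape' message points at the path itself: inside an ordinary Python string literal, `\U` starts an eight-digit Unicode escape, so a literal beginning `"C:\Users` cannot be parsed. The sketch below reproduces that outside Stata, using the path from the error message above. `literal_compiles` is a hypothetical helper written for this illustration, not something r_ml_stata provides.

```python
def literal_compiles(source_text):
    """Return True if source_text parses as a Python expression."""
    try:
        compile(source_text, "<stdin>", "eval")
        return True
    except SyntaxError:
        return False


# Path taken from the error message in this post.
windows_path = r"C:\Users\osell1\AppData\Local\Programs\Python\Python312\python.exe"

# Three ways the same path could appear as Python source code.
plain = '"' + windows_path + '"'                           # "C:\Users\..."
raw = 'r"' + windows_path + '"'                            # r"C:\Users\..."
slashes = '"' + windows_path.replace("\\", "/") + '"'      # "C:/Users/..."

for label, text in [("plain", plain), ("raw", raw), ("forward slashes", slashes)]:
    print(label, "->", "parses" if literal_compiles(text) else "SyntaxError")
```

Only the plain form fails. This suggests the error comes from how the Windows path is embedded in the generated Python call rather than from pandas itself, which would be consistent with a Python installation at a different path avoiding the problem.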




              • #8
                I am updating my previous post because I figured out why Stata wouldn't talk to Anaconda. Note that my path includes \AppData\ . . . Be sure to install Anaconda in a path like the one Giovanni Cerulli notes in the post above.
