  • file ".dta" not Stata format

    I am trying to use data from an external hard drive. Every time I begin by trying to "use" the data, I get an error message:
    Code:
    use "data/file2/myData.dta", clear
    file data/file2/myData.dta not Stata format
    However, if I re-run the code, save the file under the same name, and then try using that .dta file again, it works. So I seem to get this error only after I eject the external drive and use it again.
    I am using Stata version 14.2
    Can anyone please help?

  • #2
    Depends on the details. Some possibilities:

    1. Your "save the file" is not using save (which does save the current dataset) but saving code using the do-file editor.

    2. Somehow a later version of Stata is involved too, and your Stata can't read what it produces.

    Either way,

    Code:
    type data/file2/myData.dta
    may help, as the first few lines are quite informative. Put another way, the most useful things to tell us are exactly what you're doing (meaning the exact commands -- "save the file" is too vague) and exactly what the file contains.



    • #3
      Thank you Nick Cox for your prompt response.

      1. I saved the dataset using the 'save' command in the do-file editor as:
      Code:
      save "data/file2/myData.dta", replace
      2. I am 90% sure that I saved the data using only version 14.2 of Stata. But as 90% is not 100%, I am trying to keep track of it this time. I will update you about it soon.


      I typed
      Code:
       type data/file2/myData.dta
      as you suggested, but as my dataset is very big (~225 variables and ~38,000 observations), I am unable to see the first few lines of the results. The last few lines are also not very comprehensible. They look like this:
      Code:
       0.2. -20h.3. 20-40h.4. 40-50h.5. 50-60h.6. 60h-.</lbl><lbl>�...year.ime.at3.23.diedCat.tsCat.at...3�
      > @......)......`Eq��..�Dq��..�2:.....0 �.....�Eq��..�).......�.......Eq��..=} ......�......
      ...Z.......     ...........$...-...6...?...H...Q...
      ...........
      ...........................10. 2010.11. 2011.12. 2012.13. 2013.14. 2014.15. 2015.16. 2016.17. 2017.18. 2018.19. 2019.</lbl><lbl>(...young.
      > 3.23.diedCat.tsCat.at...3�@......)......`Eq��..�Dq��..�2:.....0 �.....�Eq��..�).......�.......Eq��..=}      ......�.....................
      > .........0. none.1. some.</lbl></value_labels></stata_dta>


      The dataset contains mostly factor and numeric variables such as age and sex. I am sorry, but I am not allowed to share anything about the data online, so I cannot show even a screenshot of the dataset or the actual variable names. I am using more or less the following commands in the do-file editor. The pattern I used for naming my .dta files is the same.
      Code:
      cd "/Volumes/HDPH-UT/survey_c"
      use "7th-co(la).dta", clear
      keep year sex
      save "data/7th_co_s2/7thco_s2_1.dta", replace
      use "data/7th_co_s2/7thco_s2_1.dta", clear
      
      generate ageCat = 1 if age < 40
      save "data/7th_co_s2/7thco_s2_2.dta", replace
      use   "data/7th_co_s2/7thco_s2_2.dta", clear
      
      
      generate sex = 1 if gender == "Male"
      save "data/7th_co_s2/7thco_s2_3.dta", replace
      use   "data/7th_co_s2/7thco_s2_3.dta", clear
      If I open the do-file editor, change my working directory (running the cd command above), and try opening the file "7thco_s2_3.dta", I get an error. I then have to re-run all the code sequentially from the beginning, after which it works.
      Last edited by bibha dhungel; 26 Apr 2022, 21:38.



      • #4
        In #3 what you describe at point 1 looks fine to me.

        Point 2 indicates that it really is a .dta file that you're reading in but doesn't let me guess what is the problem. Sorry not to have a better idea.
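Because type scrolls the entire file past, the header at the top is hard to catch for a dataset of this size. The following is a minimal sketch of one way to look at only the start of the file, assuming hexdump and its from() and to() options are available in your Stata. The 128-byte range is an arbitrary choice for illustration.

```stata
* Sketch: dump only the first 128 bytes so the file header stays on screen.
* The byte range 0-127 is an arbitrary choice, not a documented header size.
hexdump "data/file2/myData.dta", from(0) to(127)

* Optionally, summarize what kinds of bytes the whole file contains.
hexdump "data/file2/myData.dta", analyze
```

A file written by save in Stata 14 would be expected to begin with readable text along the lines of <stata_dta><header><release>118</release>. Anything else at the start would suggest that the beginning of the file is damaged, even if the end looks intact.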



        • #5
          All those Unicode replacement characters (�) make me think that an earlier version of Stata or another software that does not use Unicode encoding was involved in the process. I am not sure how that would lead to the reported error message, though.

          What is the output for

          Code:
          dtaversion data/file2/myData.dta
          Last edited by daniel klein; 27 Apr 2022, 03:06.



          • #6
            Hi daniel klein, thank you for your suggestion. I tried opening the data file again today. This time, I am 100% sure that I saved the .dta file on my external drive using only a single laptop and a single version of Stata, 14.2.
            The file does not open. When I type dtaversion followed by the file name, as you suggested, I get the following error:

            Code:
             dtaversion "data/7th_co_s2/7thco_s2_7.dta"
            
            file not Stata .dta file or ...
                File data/7th_co_s2/7thco_s2_7.dta is not a known .dta-file format.  The file might
                be a horribly corrupted .dta file, but the most likely explanation is simply that the file is not
                a .dta file.

            But if I re-run my code to save the file and then open it again, the same command works.

            Code:
            use "data/7th_co_s2/7thco_s2_6.dta", clear  
            generate ageCat = 1 if age < 40  
            save "data/7th_co_s2/7thco_s2_6.dta", replace  
            use   "data/7th_co_s2/7thco_s2_7.dta", clear // this code would WORK now



            • #7
              I have re-read your story and I do not follow. You are telling us that you cannot use a specific file. But when you use that file (which you have just claimed not to be able to) and save it, then you can subsequently use it. This makes no sense. If you cannot use the file in the first place, how are you able to save it?

              Also, as usual, details are crucially important here. Note that

              Code:
              use "data/7th_co_s2/7thco_s2_6.dta", clear
              is not the same file as

              Code:
              use "data/7th_co_s2/7thco_s2_7.dta", clear // this code would WORK now
              and I seriously doubt that the latter command would indeed "work" if it did not before. Are you keeping other information from us? How many files are involved? Are they all located in the same directory?

              Try

              Code:
              unicode analyze "data/7th_co_s2/7thco_s2_7.dta"
              and show us the output.
              Last edited by daniel klein; 07 May 2022, 00:51.



              • #8

                I made a small typo there. The above code should have been as follows.

                This did not work: [directly opening file 7]

                Code:
                 cd "/Volumes/HDPH-UT/survey_c"
                use "data/7th_co_s2/7thco_s2_7.dta", clear
                But this works: [first opening file 6, re-running the code in between, saving the file as file 7, then using file 7]

                Code:
                cd "/Volumes/HDPH-UT/survey_c"
                
                use "data/7th_co_s2/7thco_s2_6.dta", clear  //using file 6 as it works
                
                generate ageCat = 1 if age < 40  
                
                save "data/7th_co_s2/7thco_s2_7.dta", replace  // saving as file 7  
                
                use   "data/7th_co_s2/7thco_s2_7.dta", clear // using file 7 would WORK now
                Here is my code flow in do file:

                Change working directory
                Use main data file save as file 1
                use file 1
                run some codes
                save as file 2

                use file 2
                run some codes
                save as file 3

                use file 3
                run some codes
                save as file 4

                use file 4
                run some codes
                save as file 5
                Code:
                  
                
                cd "/Volumes/HDPH-UT/survey_c"
                use "7th-co(la).dta", clear
                keep year sex
                save "data/7th_co_s2/7thco_s2_1.dta", replace  
                use "data/7th_co_s2/7thco_s2_1.dta", clear
                
                generate ageCat = 1 if age < 40
                save "data/7th_co_s2/7thco_s2_2.dta", replace  
                use   "data/7th_co_s2/7thco_s2_2.dta", clear
                
                generate sex = 1 if gender == "Male"
                save "data/7th_co_s2/7thco_s2_3.dta", replace  
                
                use   "data/7th_co_s2/7thco_s2_3.dta", clear
                generate year = 1 if gender == "Male"  
                save "data/7th_co_s2/7thco_s2_4.dta", replace  
                
                use "data/7th_co_s2/7thco_s2_4.dta", clear
                generate height = 1 if gender == "Male"  
                save "data/7th_co_s2/7thco_s2_5.dta", replace  
                
                use "data/7th_co_s2/7thco_s2_5.dta", clear
                generate weight = 1 if gender == "Male"  
                save "data/7th_co_s2/7thco_s2_6.dta", replace  
                
                use "data/7th_co_s2/7thco_s2_6.dta", clear
                generate bmi = 1 if gender == "Male"  
                save "data/7th_co_s2/7thco_s2_7.dta", replace  
                
                use "data/7th_co_s2/7thco_s2_7.dta", clear
                generate sex = 1 if gender == "Male"
                Please ignore the generate commands in between; I use generate to create different variables. Around 10 files are involved. All of them are in the same directory. Surprisingly, when I first posted, 3-4 of the 10 files were not working. This time, only file 7 was not working. I tried:

                Code:
                  
                 unicode analyze "data/7th_co_s2/7thco_s2_7.dta"
                When I run the above unicode analyze command with data loaded, I get an error message:

                Code:
                unicode analyze "data/7th_co_s2/7thco_s2_7.dta"
                no; data in memory would be lost
                    unicode syntax is
                        unicode analyze      filespec
                        unicode encoding set encoding
                        unicode translate    filespec [, ...]
                        unicode retranslate  filespec [, ...]
                        unicode restore      filespec [, ...]

                    analyze and [re]translate can handle Stata datasets as well as text files such as do-files,
                    ado-files, help files, etc.

                    There must be no data in memory.  See help unicode.
                r(4);
                end of do-file
                Now when I first clear the data in memory and then run the command, I still get an error message:

                Code:
                  
                unicode analyze "data/7th_co_s2/7thco_s2_7.dta"
                filespec invalid
                    You specified data_by_me/9th_cohort_s2/9thcohort_s2_7.dta.  All files to be analyzed or translated
                    must be in the current (working) directory (folder).  Use the cd command to change directories.
                r(198);
                end of do-file
                I do not understand why the error message states "file is not in the working directory." If my file is not in the working directory, why would the following command work and load the data?

                Code:
                use  "data/7th_co_s2/7thco_s2_7.dta", clear
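The error text above says that unicode analyze accepts only files located in the current working directory, whereas use accepts a relative path, which would explain the difference. Under that reading, a minimal sketch is to change into the folder that holds the file, run the command on the bare file name, and change back. The local macro name here is made up for illustration, and the lines are meant to be run together from a do-file.

```stata
* Sketch: unicode analyze wants the file in the current working directory
* and no data in memory (both stated in the error messages above).
local here "`c(pwd)'"                  // remember where we started
cd "/Volumes/HDPH-UT/survey_c/data/7th_co_s2"
clear                                  // discard any data in memory first
unicode analyze "7thco_s2_7.dta"
cd "`here'"                            // return to the original directory
```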
                PS. All my files are now working, as I re-ran all the code. For the files to not open I might need to wait a few days before I open the data again from external drive.
                Last edited by bibha dhungel; 07 May 2022, 01:35.



                • #9
                  Thanks for the clarification.

                  Originally posted by bibha dhungel
                  I do not understand why the error message states "file is not in the working directory." If my file is not in the working directory, why would the following command work and load the data?
                  This is indeed a good question. It appears that there is something going wrong with accessing the "external drive". Perhaps you do not have write permission (unlikely if the save commands do not issue an error message) or perhaps the files are repeatedly modified by someone other than you. Your statement

                  Originally posted by bibha dhungel
                  For the files to not open I might need to wait a few days before I open the data again from external drive.
                  points in this direction. Why would something be different in a couple of days? If you do not have full control over the "external drive" then you should probably not save anything there and especially not replace existing files. If you cannot make private copies of the files for legal and/or technical reasons, then you should probably contact the party that is primarily responsible for the contents of the "external drive" and clarify your issues with them. There is nothing we can do in that case.



                  • #10
                    Thank you for your prompt response. I am not sure how things would be different in a couple of days. No one except me uses the external drive, and I use it only for this dataset. Initially, I thought the problem might re-occur every time I eject the device and use it again. However, when I ejected it and used the commands immediately, it worked perfectly. Then, when I got back to working with the files after a few weeks, the problem re-occurred.

                    I really appreciate your suggestion. I believe this might be an issue with the external drive; I will replace it before it is too late.
                    Thank you.



                    • #11
                      Originally posted by bibha dhungel
                      I believe this might be the issue with the external drive, I will replace it before it is too late.
                      I think you are right. You should replace that drive (or at least make backups). Often manuals for external drives include warnings not to simply unplug the device because this could damage the stored data. In my personal experience, I have never encountered any problems with that but I might have just been lucky.
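As a hedged sketch of what "make backups" could look like in practice, the lines below keep a second copy on the internal disk and use checksum to confirm that both copies agree right after saving. The backup folder is a hypothetical path that must already exist, and the scalar names are made up for illustration. The lines are meant to be run together from a do-file.

```stata
* Sketch only: the backup folder is a hypothetical path on the internal disk.
local backup "/Users/yourname/dta_backup"

cd "/Volumes/HDPH-UT/survey_c"
save "data/7th_co_s2/7thco_s2_7.dta", replace

* Record length and checksum of the file as written to the external drive.
checksum "data/7th_co_s2/7thco_s2_7.dta"
scalar sum_ext = r(checksum)
scalar len_ext = r(filelen)
display "external copy: checksum " scalar(sum_ext) ", length " scalar(len_ext)

* Keep a second copy off the external drive and check it the same way.
copy "data/7th_co_s2/7thco_s2_7.dta" "`backup'/7thco_s2_7.dta", replace
checksum "`backup'/7thco_s2_7.dta"

if r(checksum) != scalar(sum_ext) | r(filelen) != scalar(len_ext) {
    display as error "the two copies differ"
}
else {
    display as text "the two copies match"
}
```

Running checksum on the external copy again in a later session and comparing the result with the values noted at save time would show whether the file changed while sitting on the drive.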



                      • #12
                        Hi again daniel klein, Here is an update:
                        I changed my working directory so that I could enter just the file name after unicode analyze and it worked. Here is the output. I would in any case be replacing my device, but just wanted to give the update:

                        Code:
                         cd "/Volumes/HDPH-UT/survey_c/data/7th_co_s2"    
                        
                        unicode analyze "7thco_s2_7.dta"  
                        
                        File summary (before starting):
                                1  file(s) specified
                                1  file(s) already known to be UTF8  in previous runs
                                0  file(s) to be examined ...
                          (nothing to do)
                        Last edited by bibha dhungel; 07 May 2022, 04:36.
