  • PISA Data, using SPSS/SAS control files

    Colleagues,

    Out of personal interest, I wanted to download the PISA 2012 data. The data are disseminated in TXT format together with SPSS and SAS control files. Is it possible to read these data into Stata automatically, without translating the control files line by line, or importing the data into SPSS and then either exporting to Stata or using usespss (SSC)?
    Kind regards,
    Konrad
    Version: Stata/IC 13.1

  • #2
    Konrad,

    I'm not aware of a way to read SAS files directly into Stata. Most people here seem to use Stat/Transfer (Circle Systems) to convert SAS files to Stata. It costs $175 or so, but you can download a trial copy to test it out.

    Regards,
    Joe


    • #3
      Importing to SPSS and exporting to Stata is really the simplest option (perhaps going through R's foreign package instead of Stat/Transfer).

      Translating the SPSS syntax files would not be very difficult (and with a good text editor, even easier) since the content is simple:
      1. read fixed format data
      2. apply variable labels
      3. apply value labels (though there are problems with labels attached to codes stored as strings)
      4. apply missing values
      It would be tedious, however.
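
The four steps above can be sketched as a small generator that writes the equivalent Stata commands. This is only an illustration in Python under stated assumptions: the `Column` class, the variable names, and the column positions are hypothetical and are not taken from the actual PISA control files, and parsing the SPSS syntax itself is left out. The emitted commands (-infix-, -label variable-, -label define-, -label values-, -mvdecode-) are standard Stata.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One fixed-width field, as a control file might describe it (hypothetical)."""
    name: str
    start: int                      # first column, 1-based
    end: int                        # last column, inclusive
    is_string: bool = False
    label: str = ""
    value_labels: dict = field(default_factory=dict)  # integer code -> text
    missing: tuple = ()             # user-defined missing codes


def to_do_file(columns, data_file):
    """Return Stata do-file lines for the four steps in the list above."""
    # Step 1: read the fixed-format data with -infix-.
    specs = " ".join(
        f"{'str ' if c.is_string else ''}{c.name} {c.start}-{c.end}"
        for c in columns
    )
    lines = [f'infix {specs} using "{data_file}", clear']

    for c in columns:
        # Step 2: variable labels.
        if c.label:
            lines.append(f'label variable {c.name} "{c.label}"')
        # Step 3: value labels, numeric variables only (Stata cannot label strings).
        if c.value_labels and not c.is_string:
            pairs = " ".join(f'{k} "{v}"' for k, v in sorted(c.value_labels.items()))
            lines.append(f"label define {c.name}_lbl {pairs}")
            lines.append(f"label values {c.name} {c.name}_lbl")
        # Step 4: user-missing codes become system missing (.).
        if c.missing and not c.is_string:
            codes = " ".join(str(m) for m in c.missing)
            lines.append(f"mvdecode {c.name}, mv({codes})")
    return lines


if __name__ == "__main__":
    # Made-up layout for illustration; not the real PISA column positions.
    example = [
        Column("cnt", 1, 3, is_string=True, label="Country code"),
        Column("st04q01", 4, 4, label="Gender",
               value_labels={1: "Female", 2: "Male"}, missing=(7, 8, 9)),
    ]
    print("\n".join(to_do_file(example, "students.txt")))
```

Labels containing double quotes would need Stata's compound quotes, which this sketch does not handle.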

      Brendan


      • #4
        The PISA files are huge. The SPSS Student syntax file by itself has nearly 9,000 lines. If you don't have SPSS, I would try to become good friends with someone who does. Converting the code would be both tedious and error-prone. Converting from SPSS format to Stata format is trivial.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam


        • #5
          PSPP should be able to run simple formatting SPSS scripts. Give it a try. It's free.

          In general, however, each software package runs its own scripts and not those of the other systems. So either the data provider provides something compatible, or you have to convert either the input or the output of another system (manually or automatically). I believe this will continue until either the systems become intelligent enough to understand your intent regardless of the syntax, or until one system (Stata) dominates the others and the problem of multiple systems/languages disappears.

          Best, Sergiy


          • #6
            I am not sure if this will do what you need, but check out
            Code:
            ssc install pisatools
            It might be a helpful start.


            • #7
              Somebody else asked me about PISA 2012 recently, so I broke down and played around with this a bit. If I have done this right, Stata 12 versions of the 5 PISA 2012 files are available at

              https://drive.google.com/folderview?...k0&usp=sharing

              I ran the SPSS programs and then used Stat/Transfer to convert to Stata 12. If there is a problem with the SPSS programs and/or Stat/Transfer, you'll have to figure out what to do yourself, since (at least for the moment) I know very little about these data. The files are subject to being deleted if I run out of space! The 5 files take up 1.3 GB, and the Student file is about 1 GB of that.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology


              • #8
                At first look, the biggest problem is that the original syntax labels string values, such as "TUR" "TURKISH", and this feature is not supported in Stata. At the same time, skipping the value labels for strings would mean skipping almost all of them. Many strings appear to be just numbers stored as strings, so it might be possible to first destring the values and then apply labels to them. It would be interesting to see exactly what you did to produce the files you shared. Also, sharing a do-file would be radically more efficient, since it both answers the question "how" and saves space in the shared folder.
                Best, Sergiy Radyakin

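The destring-then-label idea in #8 can be sketched as a check on each variable's codes: if every labelled code is a whole number stored as text, emit -destring- followed by the label commands, and otherwise leave the variable as a string. This is a minimal Python sketch, not what Stat/Transfer does; the function names and the variable names used to call it are hypothetical. Note that codes such as "01" and "1" would collapse to the same integer here.

```python
import re

INTEGER = re.compile(r"-?[0-9]+")


def split_value_labels(labels):
    """Split string-coded value labels into integer-like and textual codes.

    `labels` maps a code as stored in the raw file (a string) to its label.
    Returns (numeric, textual): `numeric` maps int codes to labels and
    `textual` keeps codes such as "TUR" that are not whole numbers.
    """
    numeric, textual = {}, {}
    for code, text in labels.items():
        stripped = code.strip()
        if INTEGER.fullmatch(stripped):
            numeric[int(stripped)] = text
        else:
            textual[stripped] = text
    return numeric, textual


def label_commands(varname, labels):
    """Return Stata commands for one variable, or a do-file comment if it stays a string."""
    if not labels:
        return []
    numeric, textual = split_value_labels(labels)
    if textual:
        # At least one code is not a whole number, so value labels cannot be attached.
        return [f"* {varname}: non-numeric codes, left as a string variable"]
    pairs = " ".join(f'{code} "{text}"' for code, text in sorted(numeric.items()))
    return [
        f"destring {varname}, replace",
        f"label define {varname}_lbl {pairs}",
        f"label values {varname} {varname}_lbl",
    ]
```

For a variable whose codes are all like "1" and "2", `label_commands` returns the three Stata commands; for one with codes like "TUR", it returns only a comment line, and the labels would have to be kept some other way (for example in a separate string variable).
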

                • #9
                  The data and SPSS files are all at the original link: http://pisa2012.acer.edu.au/downloads.php

                  All I did was download the data and programs, run the programs in SPSS, save the resulting SPSS data files, and then use Stat/Transfer to convert them to Stata 12 format. I never actually used Stata! Any incompatibilities across programs were handled however Stat/Transfer handles them. If the incompatibilities are too great, then somebody may just need to break down and write Stata code rather than convert. I usually have very good luck with Stat/Transfer, but I can't guarantee it always works as well as one would like.
                  Last edited by Richard Williams; 28 May 2014, 14:28.
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology


                  • #10
                    I added a readme.txt file that explains where the files came from and what potential users should be aware of. Once I downloaded the SPSS files, it only took around 10 or 15 minutes to create the Stata files. If they are useful, great. If not, I will just get rid of them.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology


                    • #11
                      Like Brendan, I tend to use R when converting data files between formats. Having said that, I encountered some difficulties when using the write.dta function to export files to Stata. For instance, some variables had illegal names containing "-" characters. I had to pragmatically remove them in R via the gsub function in order to get "clean" and nice Stata data sets. I trust that Stat/Transfer is a more robust solution, but presently I don't have access to this software. I did have a look at pisatools (SSC), but my understanding is that the program derives estimates for specific countries according to a set of criteria. At this stage, I am only interested in making the PISA data readable in Stata.
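
The name clean-up described above can be sketched as follows. This is a Python variation on the idea, not the R gsub call that was actually used: it replaces illegal characters with underscores rather than deleting them, and it assumes the usual Stata naming rules (letters, digits, and underscores; no leading digit; at most 32 characters). The function names are hypothetical, and Stata's reserved words are not checked.

```python
import re

MAX_LEN = 32  # Stata's limit on variable-name length


def stata_name(raw):
    """Turn an arbitrary column name into one that follows Stata's naming rules."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", raw)
    if not name or name[0].isdigit():
        name = "_" + name
    return name[:MAX_LEN]


def clean_names(raw_names):
    """Sanitize a list of names, adding a numeric suffix when two collide."""
    seen = set()
    cleaned = []
    for raw in raw_names:
        base = stata_name(raw)
        candidate = base
        n = 1
        while candidate in seen:
            suffix = f"_{n}"
            # Trim the base so the suffixed name still fits in MAX_LEN characters.
            candidate = base[:MAX_LEN - len(suffix)] + suffix
            n += 1
        seen.add(candidate)
        cleaned.append(candidate)
    return cleaned
```

The collision check matters because replacing "-" with "_" can map two distinct source names onto the same Stata name.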

                      Richard, I downloaded the files that you were kind enough to share; this is much appreciated.
                      Kind regards,
                      Konrad
                      Version: Stata/IC 13.1


                      • #12
                        Stat/Transfer is a wonderful program. I probably only use it a few times a year, but it is invaluable when you do need it. If you are at a university that has Stata's GradPlan, it is fairly cheap.

                        I ran a few frequencies and they matched the codebook. But if you see problems, let me know. There are a few settings in Stat/Transfer that can be tweaked.

                        One thing I noticed is that all SPSS missing values got coded as system missing, i.e., as a plain period (.). There is an option in Stat/Transfer that will recode them to .a, .b, .c, etc. I am not sure why that isn't the default, but in any event I have reset it in my Stat/Transfer program. I will rerun and, if all goes well, upload the revised versions.
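
For anyone doing the same recoding by hand, the mapping to extended missing values can be sketched as a helper that builds a Stata -mvdecode- command. This is a Python illustration under an assumption: letters are assigned per variable in ascending order of the codes, which is not necessarily the rule Stat/Transfer applies. The function name and the example variable are hypothetical.

```python
import string


def mvdecode_command(varname, missing_codes):
    """Build a Stata -mvdecode- call mapping user-missing codes to .a, .b, .c, ..."""
    codes = sorted(set(missing_codes))
    if not codes:
        raise ValueError("no missing codes given")
    if len(codes) > len(string.ascii_lowercase):
        raise ValueError("Stata has only 26 extended missing values (.a to .z)")
    # Pair each code with the next letter: smallest code -> .a, next -> .b, and so on.
    pairs = [f"{code}=.{letter}" for code, letter in zip(codes, string.ascii_lowercase)]
    rules = " \\ ".join(pairs)
    return f"mvdecode {varname}, mv({rules})"
```

Called as `mvdecode_command("st04q01", [7, 8, 9])`, this produces a command that turns 7, 8, and 9 into .a, .b, and .c respectively, so the different kinds of missingness stay distinguishable in Stata.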
                        -------------------------------------------
                        Richard Williams, Notre Dame Dept of Sociology


                        • #13
                          I'm surprised not to hear Sergiy's -usespss- mentioned here as part of a solution, thus avoiding the need for Stat/Transfer. In my casual experience, -usespss- reads SPSS files faster than SPSS reads them! So, a free solution using PSPP, then -usespss- seems possible. The one catch here is that -usespss- requires a 32-bit version of Stata, or at least did the last time I used it.

                          Regards, Mike


                          • #14
                            I have 64-bit Stata and Stat/Transfer, so I never use -usespss-, though I have recommended it to others. I just tried -usespss- on a 32-bit machine and it seemed to work fine with the PISA data. I believe Sergiy has been fiddling around with a 64-bit version of -usespss-, which could be very helpful since, I suspect, most people (or at least more and more) use 64-bit Stata now. Stat/Transfer will, of course, let you work with many more file formats.
                            -------------------------------------------
                            Richard Williams, Notre Dame Dept of Sociology


                            • #15
                              Originally posted by Richard Williams:
                              "I will rerun and, if all goes well, upload the revised versions."
                              This is greatly appreciated, thank you very much.

                              Originally posted by Mike Lacy:
                              "In my casual experience, -usespss- reads SPSS files faster than SPSS reads them!"
                              This is my impression as well; usespss is extremely handy. read.spss in R is also rather efficient, and combining it with the write.dta function makes it possible to build a poor man's Stat/Transfer, minus minor shortcomings associated with variable naming/types that often have to be tackled through explicit functions. Further on the matter of syntax, the PISA syntax that we mentioned earlier does not appear to be particularly complex. On a more general point, it is slightly disappointing that the OECD disseminates only SPSS and SAS control files.
                              Kind regards,
                              Konrad
                              Version: Stata/IC 13.1
