  • ODK to Stata: Unnamed variables

    Hi everyone,

    I am using ODK to transfer data collected through electronic surveys to Stata. When analyzing the data, I found that some variables had lost their names (luckily they kept their labels, so it was easy to identify and rename them). Since I'll be repeating the same data collection process, I need to know what could have caused this.

    More on the unnamed variables: all of them were named "v" followed by a number (e.g. v24, v25). Also, all of them except the last one belong to follow-up questions, specifically "specify other" questions. I checked the form programming in Excel, and the constraints were correct.

    Any ideas? Thanks!

    Aguitas

  • #2
    Is this a claim that Stata messed up your data? Or that ODK [whatever that is] or MS Excel did so? Stata can't use the same variable name twice; that's one constraint.



    • #3
      My guess is that some variables lost their names while the data were being transferred. It could have been ODK (an electronic survey platform that transfers data to Stata or to any other format you want); I am not sure.

      Five variables took on the names v26, v27, v36, v37, and v44. The curious thing is that they all took a "v" prefix, and that all of them except the last unnamed variable belong to a follow-up question ("specify other").



      • #4
        Unless someone speaks up with experience of the same software, I don't think much more comment is possible without more details, especially a reproducible example. Sometimes it is easier to show files to StataCorp Technical Support, but manifestly they have no responsibility for ODK.



        • #5
          How are you transferring the data? Are you importing the .csv files with insheet?

          I wrote odkmeta, a program for importing ODK data along with the metadata from the survey and choices worksheets of the XLSForm. You can install it by typing ssc install odkmeta.

          The odkmeta help file notes the following about ODK field names:

          ODK field names follow different conventions from Stata's constraints on variable names.
          Further, the field names in the .csv files are the fields' "long names," which are
          formed by concatenating the list of the groups in which the field is nested with the
          field's "short name." ODK long names are often much longer than the length limit on
          variable names, which is 32 characters.

          These differences in convention lead to three kinds of problematic field names:

          1. Long field names that involve an invalid combination of characters, for example,
          a name that begins with a colon followed by a number. insheet will not convert
          these to Stata names, instead naming each variable v concatenated with a
          positive integer, for example, v1.
          2. Long field names that are unique ODK names but when converted to Stata names and
          truncated to 32 characters become duplicates. insheet will again convert these
          to v# names.
          3. Long field names of the form v# that become duplicates with other variables that
          cannot be converted, for which insheet chooses v# names. These will be
          converted to different v# names.
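
          As a rough illustration of the three cases above, here is a minimal Python sketch for pre-checking a list of ODK long field names before import. It is not how insheet or odkmeta work internally: the helper names (to_stata_like, flag_problem_names) are hypothetical, and the conversion rule (drop every character that is not a letter, digit, or underscore, then truncate to 32 characters) is an assumption made only for this example.

```python
import re
from collections import Counter

STATA_NAME_LIMIT = 32  # length limit on variable names quoted in the help file


def to_stata_like(name):
    """Crude stand-in for an importer's name conversion (an assumption):
    drop characters other than letters, digits, and underscores, then truncate."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "", name)
    return cleaned[:STATA_NAME_LIMIT]


def flag_problem_names(long_names):
    """Return {long_name: reason} for names that may end up as v# variables."""
    converted = {name: to_stata_like(name) for name in long_names}
    counts = Counter(converted.values())
    problems = {}
    for name, stata_name in converted.items():
        if re.fullmatch(r"v[0-9]+", name):
            # Case 3: may clash with the v# names chosen for other variables.
            problems[name] = "already of the form v#"
        elif not stata_name or not re.match(r"[A-Za-z_]", stata_name):
            # Case 1: nothing usable left, or the name starts with a digit.
            problems[name] = "cannot be converted to a valid name"
        elif counts[stata_name] > 1:
            # Case 2: unique in ODK, but no longer unique once truncated.
            problems[name] = "duplicate after truncation"
    return problems


names = [
    "household_roster-members_present-children_under18-gender",
    "household_roster-members_present-children_under18-age",
    ":1abc",
    "v1",
    "consent",
]
for name, reason in flag_problem_names(names).items():
    print(f"{name}: {reason}")
```

          Running a check like this on the .csv header row before importing should show which fields are worth renaming in the form itself.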



          • #6
            Matthew: You have exemplary documentation there.



            • #7
              Hi Matthew,

              Thanks so much for your post. Yes, we were importing the .csv files with insheet (we are using odkmeta).

              Case #2 is probably what happened to us. We'll have to confirm this, but this is the answer I was looking for.

              Thanks again!



              • #8
                Hello
                I'm trying to use odkmeta. I generated a do-file, but I can't get it to run: I have long names because my XML form has groups and repeats whose names are concatenated. I tried putting the code for short names shown in odkmeta's help into the do-file, but I'm not sure where the correct place to put it is. Could you help me? Thanks. Regards



                • #9
                  Hi lisset,

                  I have been using odkmeta to import ODK data (.csv) into Stata, and along the way I have learned several things that one needs to take care of when doing the ODK programming. It takes time to discover some of these tweaks. One of the issues is variable names. In ODK programming we may use groups or nested groups. The variable names in the .csv file then consist of the group names (including any nested groups) followed by the field name. For instance, suppose you have a group called "household_roster", then a nested group "members_present", and a final nested group (inside "members_present") named "children_under18". Within this last group you have a variable gender. The variable name for gender will be household_roster-members_present-children_under18-gender. This name has 56 characters (more than Stata's limit of 32). When this variable is insheeted into Stata, it will be assigned a name of the form v#. You can deal with this beforehand, while doing the ODK programming, or modify the odkmeta-generated do-file (be very cautious if you do this!).
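
                  The arithmetic in that example can be checked with a short Python sketch. The helper name odk_long_name is hypothetical, and the "-" separator is simply taken from the example name above; it is an assumption that other exports use the same separator.

```python
STATA_NAME_LIMIT = 32  # Stata's limit on variable name length


def odk_long_name(groups, short_name, sep="-"):
    """Join the enclosing group names and the field's short name into one long name."""
    return sep.join(list(groups) + [short_name])


long_name = odk_long_name(
    ["household_roster", "members_present", "children_under18"], "gender"
)
too_long = len(long_name) > STATA_NAME_LIMIT
print(long_name, len(long_name), too_long)
```

                  Shorter group names, or fewer levels of nesting, are the simplest way to keep such names under the limit at the form-design stage.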

