  • Reshape Best to Worst dataset

    Dear all,
    This is a second request for help, following on from my earlier question at https://www.statalist.org/forums/for...haping-to-long.
    I need to reshape a Best to Worst Scaling (BTWS) dataset exported from Sawtooth Software.
    For those unfamiliar with it, the BTWS method is based on surveys in which the respondent is shown a set (CHOICE SET) of 4 alternatives (ITEMS, denoted A) and asked to indicate the best and the worst in their opinion. This task is repeated several times; in our case there are 8 choice sets. In addition to the BTWS exercise, the questionnaire collected socio-demographic information such as income, age, and sex. Sawtooth Software can produce a ranking of the items from the difference between best and worst counts, but we want to fit a logit model so that we can also include other variables as predictors.
    The software exports a dataset laid out as follows:
    ID | B1 | W1 | B2 | W2 | B3 | W3 | B4 | W4 | B5 | W5 | B6 | W6 | B7 | W7 | B8 | W8 | AGE | INCOME
    where ID identifies the interviewee, B1 is the "best" option chosen by the interviewee in the first CHOICE SET and W1 is the "worst" option chosen by the interviewee in the first CHOICE SET, B2 and W2 are the same for the second CHOICE SET, and so on up to B8 and W8. AGE and INCOME are the socio-economic variables for each interviewee.

    To fit a logit model, the dataset needs a different layout.
    Starting from the row recorded for each interviewee, I need to obtain a dataset like this:
    ID; Interviewee_ID; Choice Set; A1; A2; A3; A4; A5; ...; A24; CHOICE; AGE, INCOME
    1; 1; 1; 0; 1; 0; 0; 0; ...; 0; 0; 64; 10000
    2; 1; 1; 1; 0; 0; 0; 0; ...; 0; 1; 64; 10000
    3; 1; 1; 0; 0; 0; 0; 1; ...; 0; 0; 64; 10000
    4; 1; 1; 0; 0; 0; 0; 0; ...; 1; 0; 64; 10000
    5; 1; 1; 0; -1; 0; 0; 0; ...; 0; 0; 64; 10000
    6; 1; 1; -1; 0; 0; 0; 0; ...; 0; 0; 64; 10000
    7; 1; 1; 0; 0; 0; 0; -1; ...; 0; 1; 64; 10000
    8; 1; 1; 0; 0; 0; 0; 0; ...; -1; 0; 64; 10000
    9; 1; 2; 0; 0; 0; 1; 0; ...; 0; 1; 64; 10000
    10; 1; 2; 0; 1; 0; 0; 0; ...; 0; 0; 64; 10000
    11; 1; 2; 0; 0; 0; 0; 1; ...; 0; 0; 64; 10000
    12; 1; 2; 0; 0; 0; 1; 0; ...; 0; 0; 64; 10000
    13; 1; 2; 0; 0; 0; -1; 0; ...; 0; 0; 64; 10000
    14; 1; 2; 0; -1; 0; 0; 0; ...; 0; 1; 64; 10000
    15; 1; 2; 0; 0; 0; 0; -1; ...; 0; 0; 64; 10000
    16; 1; 2; 0; 0; 0; -1; 0; ...; 1; 0; 64; 10000
    ...

    Here the first variable is the ID of the row in the entire dataset, Interviewee_ID is the ID of the respondent, and Choice Set is the group of alternatives (1-8) from which the best and worst are selected. Variables A1 to A24 indicate the items: a variable takes the value 1 if the alternative is among the possible BEST options and -1 if it is among the possible WORST options. For each set there are 4 BEST and 4 WORST rows, and in the dataset the BEST rows must appear first, followed by the WORST rows. CHOICE indicates the choice made by the interviewee (1 marks the row that describes the interviewee's choice), followed by AGE and INCOME. The example shows only 2 choice sets for the first interviewee. For each respondent we should have (4 + 4) * 8 = 64 rows.
    In the example, in CHOICE SET 1, respondent 1 chose A1 as Best and A5 as Worst, while in CHOICE SET 2, respondent 1 chose A4 as Best and A2 as Worst.

    Is it possible to carry out this transformation in Stata?

    Many thanks in advance

    Federico
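A minimal Stata sketch of the transformation described in the post above, under stated assumptions. The export lists only the chosen items, so the sketch assumes that the four items shown in each choice set are known from the experimental design and are the same for every respondent. The toy data (2 respondents, 2 choice sets, 6 items), the design values in `item1`-`item4`, and the helper names `cset`, `pos`, `best`, and `row` are all hypothetical.

```stata
clear
* Toy wide data in the exported layout (2 choice sets instead of 8)
input ID B1 W1 B2 W2 AGE INCOME
1 1 5 4 2 64 10000
2 2 6 3 5 41 25000
end

* One row per respondent and choice set
reshape long B W, i(ID) j(cset)

* ASSUMPTION: hypothetical design, the four items shown in each set
gen byte item1 = cond(cset == 1, 1, 2)
gen byte item2 = cond(cset == 1, 2, 3)
gen byte item3 = cond(cset == 1, 5, 4)
gen byte item4 = cond(cset == 1, 6, 5)

* One row per item shown, then duplicate each row for the best and worst tasks
reshape long item, i(ID cset) j(pos)
expand 2
bysort ID cset pos: gen byte best = (_n == 1)

* Signed item indicators: +1 in best rows, -1 in worst rows, 0 otherwise
forvalues k = 1/6 {
    gen byte A`k' = cond(item == `k', cond(best, 1, -1), 0)
}

* CHOICE marks the row the respondent actually picked in each task
gen byte CHOICE = cond(best, item == B, item == W)

* Best block first, then worst block, within each choice set
gsort ID cset -best pos
gen long row = _n
```

With the real data the loop would run over 1/24 and `reshape long B W` would pick up all 8 sets. If the design differs across respondents or questionnaire versions, the item lists would have to be merged in from the design file instead of being generated with `cond()`.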

  • #2
    Thanks to all. I solved the problem using SPSS.



    • #3
      Hi Federico, I'm using the BWS Case 2 approach in my research. I have 28 attribute levels (AL) and choice sets of 5 ALs, shown 10 times. I managed to restructure the data set in Stata as follows:
      Respondent ID, Choice set, AL1, ..., AL28, Choice, Income, gender, age. I then ran a logit model that produced utility scores. What confuses me is how the model is able to identify the choice set. In SPSS, how does the software know how to identify each choice set (profile)? Did you have to specify the number of rows in each profile when coding?
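On the Stata side of this question: `clogit` has no built-in notion of a choice set. It only knows which rows belong together from the variable passed to `group()`, so the number of rows per set never has to be declared. The sketch below builds such an identifier on toy data and checks it. The variable names `resp`, `cset`, `best`, `pos`, `choice`, and `task` are hypothetical, and the `choice` values are placeholders rather than real responses.

```stata
clear
* Toy long layout: 2 respondents x 2 choice sets x 2 tasks (best, worst) x 5 rows
set obs 40
gen long resp   = ceil(_n / 20)
gen byte cset   = mod(ceil(_n / 10) - 1, 2) + 1
gen byte best   = mod(ceil(_n / 5) - 1, 2) == 0
gen byte pos    = mod(_n - 1, 5) + 1
gen byte choice = (pos == 1)   // placeholder: first row of every task is "chosen"

* One identifier per choice task; this is what group() would receive
egen long task = group(resp cset best)

* Sanity checks: 5 rows and exactly one chosen row per task
bysort task: gen byte nrows = _N
bysort task: egen byte npicked = total(choice)
assert nrows == 5
assert npicked == 1
```

If either `assert` fails on real data, the grouping variable does not line up with the choice tasks, and the estimates would not mean what they appear to mean.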



      • #4
        Federico DellAnna Just wanted to check which command you used in Stata to model the Best-Worst scaling data you generated above. I tried the cmxtmixlogit command, but it does not support this data layout.



        • #5
          Hi Federico DellAnna Udeni DeSP Wizaso Munthali, I am using the BWS2 approach. Suppose I have transformed the BWS responses to the format described in the first post: "ID; Interviewee_ID; Choice Set; A1; A2; A3; A4; A5; ...; A24; CHOICE; AGE, INCOME". How do you then run the clogit model on this data set? Should I omit one attribute level, e.g., A24, and run the command as below:

          clogit CHOICE A1 A2 A3 ... A23, group(ID)

          Is that the correct way to code? Or do I have to add anything in the data set?

          I noticed that the coefficients did not change when I alternated the omitted variables or when I included all 24 attribute levels in the models.

          Any advice is very much appreciated!
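A hedged sketch of how the `clogit` call above could be set up, run on simulated data so that it is self-contained. Two points follow from the layout itself. First, within any one task the signed indicators sum to +1 (best rows) or -1 (worst rows) on every row, so a full set of item indicators is identified only up to an additive constant: one level serves as the reference, and changing the reference shifts every coefficient by the same amount while leaving differences between items unchanged. Second, `group()` needs an identifier of the choice task, not of the row. Everything in the block is an assumption for illustration: 6 items instead of 24, the true item effects, the random designs, and the names `resp`, `cset`, `item`, `best`, `choice`, and `task`.

```stata
clear
set seed 12345
* Simulated toy data: 300 respondents x 3 sets x 4 items drawn from 6
set obs 300
gen long resp = _n
expand 3
bysort resp: gen byte cset = _n
expand 6
bysort resp cset: gen byte item = _n

* Keep a random 4 of the 6 items in each set
gen double u0 = runiform()
bysort resp cset (u0): keep if _n <= 4

* Latent utility: assumed item effect plus Gumbel noise
gen double util = 0.3 * item - ln(-ln(runiform()))
bysort resp cset (util): gen byte isworst = (_n == 1)
bysort resp cset (util): gen byte isbest  = (_n == _N)

* Duplicate rows into a best task and a worst task
expand 2
bysort resp cset item: gen byte best = (_n == 1)
gen byte choice = cond(best, isbest, isworst)

* Signed item indicators, as in the layout from the first post
forvalues k = 1/6 {
    gen byte A`k' = cond(item == `k', cond(best, 1, -1), 0)
}

* One group per respondent x choice set x task; item 6 is the reference level
egen long task = group(resp cset best)
clogit choice A1 A2 A3 A4 A5, group(task)
```

Treating best and worst as two separate tasks per choice set is one common way to set up the model, not the only one. Grouping on the choice set alone would place two chosen rows in each group and estimate a different likelihood.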
