  • Programming? Picking 4 of 10 variables based on rank.

    I'm working with social network data, which I've never worked with before. Each person can identify up to 10 persons of importance in their social network and is asked how important each of those persons is to them, rated from 0 "not at all" to 4 "extremely". For each person identified, they are also asked how frequently they are in contact with that person, how much that person uses alcohol, etc. In some cases the coding protocol asks me to select up to 4 persons with the highest importance. (I didn't develop the protocol; I'm just following the one used by folks in the field, and it's not how I'd do it.) So, for each subject I have up to 10 importance variables. They could be coded something like 4, 4, 2, 3, 4, 3, 4, 1, ., . I can't find a reasonably convenient way to select the 4 variables identifying the most important persons from among the 10 possible persons. I've thought of reshaping to long format, but I don't see a convenient solution that way either. Thanks.

  • #2
    Long format is almost always the way to go in Stata if you have multiple instances of the same measure within observation. Does this look like what you want? You could of course go back to wide format afterward.

    Code:
    // Create some simulated data for illustration.
    // I'm guessing on format.
    clear
    set seed 47554
    set obs 3
    gen int id = _n
    gen byte howmany = 10 in 1   // how many did this person name?
    replace howmany = 5 in 2
    replace howmany = 1 in 3
    forval i = 1/10 {
        gen byte import`i' = ceil(runiform() * 4) if `i' <= howmany
        gen somex`i' = runiform() if `i' <= howmany
    }
    // end simulate data
    //
    reshape long import somex, i(id) j(person)
    // Within ID, sort observations with the highest importance scores to the top.
    // A random tiebreaker is presumably necessary: with scores limited to 1/4, ties are likely.
    gen tiebreak = runiform()
    gsort id -import tiebreak  // gsort to get big scores to top
    // Tag the top 4 within ID, among those that have an importance score
    by id: gen byte intop4 = (_n <= 4) & !missing(import)
    list id import intop4 if intop4 == 1
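
For the step of going back to wide format afterward, a minimal sketch might look like the following. It is an illustration under assumptions, not part of the code above: the single toy row reuses the example codes from the question, the name top4_ is made up here, and bysort on a negated copy of the score stands in for gsort (missing scores still sort last).

```stata
// Hypothetical toy data: one subject with the codes from the question
// (4, 4, 2, 3, 4, 3, 4, 1, ., .). Variable names are assumptions.
clear
input id import1 import2 import3 import4 import5 import6 import7 import8 import9 import10
1 4 4 2 3 4 3 4 1 . .
end

reshape long import, i(id) j(person)

// Sort the highest scores to the top within id; random tiebreaker as before.
set seed 47554
gen double tiebreak = runiform()
gen negimport = -import          // stays missing when import is missing
bysort id (negimport tiebreak): gen byte top4_ = (_n <= 4) & !missing(import)

// Drop the helpers, then return to one row per subject.
drop negimport tiebreak
reshape wide import top4_, i(id) j(person)

// top4_1 ... top4_10 now flag which of import1 ... import10 were selected.
list id top4_*
```

In this toy row exactly four persons are rated 4, so the tiebreaker does not affect which ones are flagged; with more ties at the cutoff, the selection would depend on the seed.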



    • #3
      Thanks Mike! The basic solution came to me after I went to bed last night. Easy solution but just couldn't see it in my head yesterday. Thanks again.
