  • List unique obs from appended dataset

    clear
    input long ID str100 org_name str16 datasource
    1234456 "Helping Hand" "Survey1"
    1234456 "Helping Hand" "Survey2"
    9089081 "Save Children" "Survey1"
    9089081 "Save Children" "Survey2"
    9089081 "Save Children" "Survey3"
    4532901 "Planet Earth" "Survey1"
    end

    I would like to create a list of organizations that participated in all three surveys (Survey1, Survey2, and Survey3, recorded in the variable datasource). Basically, I want to know how many organizations participated in all three surveys, in just two, or in only one.

    In the end, I would like to create a new dataset from the master dataset that keeps only the organizations that took all three surveys (or two, etc.), as identified by datasource.

    Thanks in advance!
    Last edited by abhishek bhati; 01 Jun 2021, 13:34.

  • #2
    Something like:

    Code:
    isid org_name datasource
    preserve
    contract ID
    *ALL 3
    list if _freq==3
    *TWO
    list if _freq==2
    restore


    • #3
      It's giving me an error message that org_name does not uniquely identify the observations, which is true: the unique identifier is ID. Rather than listing the IDs, could I create a new dataset of the organizations that participated in all three surveys? The output would look like:
      9089081 "Save Children" "Survey1"
      9089081 "Save Children" "Survey2"
      9089081 "Save Children" "Survey3"


      • #4
        If the following does not do it, you need to give a data example using dataex, since your description would then appear not to match the actual dataset. See FAQ #12 for more details.

        Code:
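        preserve
        * keep only the identifying variables
        keep ID org_name datasource
        * drop exact duplicate rows so each ID-survey pair appears once
        duplicates drop *, force
        * keep organizations with at least three remaining rows
        bys ID: drop if _N<3
        l, sepby(ID)
        restore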


        • #5
          Thanks, Andrew. The code works and lists the observations. However, is it possible to drop the rest of the observations rather than just listing them? I am interested in creating a new dataset.


          • #6
            Code:
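            * in1-in3 flag every row of an ID that has a Survey1-Survey3 record
            forval i=1/3{
                bys ID: egen in`i'= max(datasource=="Survey`i'")
            }
            preserve
            * keep organizations present in all three surveys
            keep if in1&in2&in3
            save all3, replace
            restore
            * load the saved subset as the working dataset
            use all3, clear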
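
            The original question also asked how many organizations took three, two, or one survey. A minimal sketch of one way to get that, assuming ID identifies an organization and datasource is spelled consistently across rows. The variable names first, n_surveys, and tag are made up for illustration and are not from the posts above:

            ```stata
            * flag the first row of each ID-datasource pair, so that
            * duplicate rows for the same survey are counted only once
            bysort ID datasource: gen byte first = _n == 1

            * number of distinct surveys per organization
            bysort ID: egen n_surveys = total(first)

            * tag one row per organization, so the table counts
            * organizations rather than rows
            egen byte tag = tag(ID)
            tabulate n_surveys if tag

            * keep only the organizations that took all three surveys
            keep if n_surveys == 3
            drop first tag
            save all3, replace
            ```

            Changing the condition to n_surveys == 2 would give the organizations that took exactly two surveys. Unlike the loop above, this does not depend on the surveys being named Survey1 to Survey3.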
