  • Restricting data to specific observations

    Hello everyone,

    I am working with a large dataset (4,682 variables, 5,726 observations) of students attending U.S. schools. I want to restrict the data to students who experienced bullying or cyberbullying during the school year. That is, I want to keep only students who answered "yes" (coded 1) to at least one of the seven bullying-related or seven cyberbullying-related questions, and exclude students who were not bullied.

    I used the generate command to create a variable that equals 1 if any of the bullying variables equals 1, and did the same for cyberbullying.
    Code:
    gen bullyvic=1 if VS0073==1|VS0074==1|VS0075==1|VS0076==1|VS0077==1|VS0078==1|VS0079==1
    Code:
    tab bullyvic
    Total=1,017
    Code:
    gen cyberbullyvic=1 if VS0097==1|VS0156==1|VS0098==1|VS0099==1|VS0100==1|VS0101==1|VS0102==1
    Code:
    tab cyberbullyvic
    Total=330

    Code:
    tab bullyvic cyberbullyvic
    Total: 253

    However, I do not know how to restrict the data to the sample of 1,094 students who experienced bullying, cyberbullying, or both.

    I tried using these commands:
    Code:
    drop if
    and
    Code:
    keep if
    However, they delete every case except those that said "yes" to the single variable in the condition. That is a problem when a respondent said "yes" to more than one variable, meaning that they encountered multiple bullying behaviors. Is there a way to sum all the "yes" responses across the variables listed above to create a new dataset?

    Many thanks for considering my request.


    Last edited by Raven Lewis; 03 Feb 2022, 00:22.

  • #2
    Raven:
    Welcome to this forum.
    You may want to try something along the following lines:
    Code:
    . set ob 2
    Number of observations (_N) was 0, now 2.
    
    . g id=_n
    
    . g A=1
    
    . g B=1
    
    . g C=1 in 1
    (1 missing value generated)
    
    . replace C=0 in 2
    (1 real change made)
    
    . egen wanted=rowmean(A B C)
    
    . list
    
         +---------------------------+
         | id   A   B   C     wanted |
         |---------------------------|
      1. |  1   1   1   1          1 |
      2. |  2   1   1   0   .6666667 |
         +---------------------------+
    
    .
    Values of wanted greater than 0 mean that the student has encountered at least one episode of bullying behavior (given that "no" is coded 0, as in this toy example).
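    Applied to the variables named in the original question, the same idea might look like the sketch below. It assumes that "yes" is coded 1 in all fourteen items and that the code is run from a do-file (the /// line continuation does not work interactively). The names items, anyvic and nyes are hypothetical, and anymatch() and anycount() are swapped in for rowmean() here because they test for the value 1 directly, so the result does not depend on how "no" is coded.

    ```stata
    * Sketch only: assumes "yes" is coded 1 in every item listed below.
    local items VS0073 VS0074 VS0075 VS0076 VS0077 VS0078 VS0079 ///
                VS0097 VS0098 VS0099 VS0100 VS0101 VS0102 VS0156

    * anyvic (hypothetical name): 1 if any item equals 1, 0 otherwise
    egen byte anyvic = anymatch(`items'), values(1)

    * nyes (hypothetical name): number of items answered "yes"
    egen byte nyes = anycount(`items'), values(1)

    * Keep students who were bullied, cyberbullied, or both
    keep if anyvic == 1
    count
    ```

    Because keep changes the data in memory, saving the result under a new file name leaves the full file intact. If the counts reported in the question hold (1,017 + 330 - 253), count should report 1,094 observations.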
    Kind regards,
    Carlo
    (StataNow 18.5)
