Restricting data to specific observations

Raven Lewis

Join Date: Feb 2022

Posts: 1
#1

03 Feb 2022, 00:12

Hello everyone,

I am working with a large dataset (4,682 variables, 5,726 observations) of students attending U.S. schools. I want to restrict the data to students who experienced bullying or cyberbullying during the school year. That is, I want to keep only students who answered "yes" (coded 1) to at least one of the seven bullying-related or seven cyberbullying-related questions, and drop those who were not bullied.

I used the generate command to create a variable equal to 1 if any of the bullying variables equals 1, and did the same for cyberbullying.

Code:

gen bullyvic=1 if VS0073==1|VS0074==1|VS0075==1|VS0076==1|VS0077==1|VS0078==1|VS0079==1

Code:

tab bullyvic

Total=1,017

Code:

gen cyberbullyvic=1 if VS0097==1|VS0156==1|VS0098==1|VS0099==1|VS0100==1|VS0101==1|VS0102==1

Code:

tab cyberbullyvic

Total=330

Code:

tab bullyvic cyberbullyvic

Total: 253

However, I do not know how to restrict the data to the sample of 1,094 students who experienced bullying, cyberbullying, or both.

I tried using these commands

Code:

drop if

and

Code:

keep if

However, they end up deleting every case except those that said "yes" to the single variable in the condition. That is a problem when a respondent said "yes" to more than one variable, meaning they encountered multiple bullying behaviors. Is there a way to sum the "yes" responses across all of the variables listed above and use that to create the new dataset?

Many thanks for considering my request.

Last edited by Raven Lewis; 03 Feb 2022, 00:22.
Tags: data

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17598

03 Feb 2022, 01:18

Raven:
Welcome to this forum.
You may want to try something along the following lines:

Code:

. set ob 2
Number of observations (_N) was 0, now 2.

. g id=_n

. g A=1

. g B=1

. g C=1 in 1
(1 missing value generated)

. replace C=0 in 2
(1 real change made)

. egen wanted=rowmean(A B C)

. list

     +---------------------------+
     | id   A   B   C     wanted |
     |---------------------------|
  1. |  1   1   1   1          1 |
  2. |  2   1   1   0   .6666667 |
     +---------------------------+

.

With items coded 0/1 as in this toy example, a value of wanted greater than 0 means that the student has encountered at least one episode of bullying behavior.
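To connect this to the variables in the original post, here is a minimal sketch on made-up data. It counts the "yes" answers per student with egen's anycount() function and then keeps anyone with at least one "yes" in either block. The toy values, the names n_bully and n_cyber, and the coding of "no" as 2 are assumptions for illustration only; just two items per block are shown, and in the real data each varlist would list all seven variables.

```stata
* Toy data standing in for the survey (values are invented).
* Assumption: 1 = yes, 2 = no, . = missing.
clear
input id VS0073 VS0074 VS0097 VS0156
1 1 2 2 2
2 2 2 1 1
3 2 2 2 2
4 1 1 1 2
5 . 2 2 .
end

* Count how many items in each block equal 1 ("yes") for each student.
* anycount() does not count missing values as a match.
egen n_bully = anycount(VS0073 VS0074), values(1)
egen n_cyber = anycount(VS0097 VS0156), values(1)

* 0/1 indicators instead of 1/missing, so the non-victims stay visible in tab.
gen byte bullyvic      = n_bully > 0
gen byte cyberbullyvic = n_cyber > 0

* Keep students who reported bullying, cyberbullying, or both.
keep if bullyvic == 1 | cyberbullyvic == 1
```

Because the condition in keep if joins the two indicators with | (or), a student who said "yes" to several items is kept once, not dropped. The counts n_bully and n_cyber remain in the data if the number of behaviors reported is of interest.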

Kind regards,
Carlo
(StataNow 18.5)
