Dividing the dataset into two subsets

Badar Khalid

Join Date: Oct 2015

Posts: 58
#1


07 Aug 2016, 14:38

Hello Dear Statalist Users,

I have a data set that contains around 15 countries, and I have classified them by the legal system of the country (civil = 0 and common = 1).

I would like to run the regression separately for civil law and common law countries. Can you please let me know the code to split the data into the two subsets when I run the following regression:

Code:

xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate),fe robust cluster (CompanyID)

Mari Meir

Join Date: Jul 2016
Posts: 61

07 Aug 2016, 16:03

Hello,

I would do this:

Code:

 
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==0, fe robust cluster(CompanyID)
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==1, fe robust cluster(CompanyID)
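
For anyone who wants to try the subset approach without the original data, here is a minimal sketch on simulated panel data. Everything in it (the 40 firms, the variables x and y, the way civil is assigned) is made up for illustration. Note that the if qualifier sits before the comma that introduces the options.

```stata
* Simulated panel: 40 hypothetical firms observed for 6 years each
clear
set seed 12345
set obs 240
generate CompanyID = ceil(_n/6)
bysort CompanyID: generate Year = 2009 + _n
generate civil = mod(CompanyID, 2)   // made-up indicator, constant within firm
generate x = rnormal()
generate y = 0.5*x + rnormal()
xtset CompanyID Year

* Same model on each subset; the if qualifier precedes the comma
xtreg y L.x if civil == 0, fe vce(cluster CompanyID)
scalar n0 = e(N)
xtreg y L.x if civil == 1, fe vce(cluster CompanyID)
scalar n1 = e(N)

* Each subset has 20 firms; the lag costs one year per firm
display "N (civil==0) = " n0 "   N (civil==1) = " n1
```

Checking e(N) after each run is a quick way to confirm that the restriction picked up the observations you expected.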

Does it work for you?


Badar Khalid

Join Date: Oct 2015

Posts: 58
#3

07 Aug 2016, 16:39

Thank you so much Mari Meir, it works well.

Can you please let me know the code if I want a subset based on two filters in the same regression? I have many countries with a civil law system. For example, what would the code be if I want civil law observations for France only? (Please note that I have a dummy variable for France, 0 or 1.)

Thank you.

Mari Meir

Join Date: Jul 2016
Posts: 61

08 Aug 2016, 01:26

Hello Badar Khalid. Every time you need more filtering, just add "&" after the existing restriction, followed by the new restriction.

Code:

xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==0 & France==1, fe robust cluster(CompanyID)
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==1 & France==1, fe robust cluster(CompanyID)
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==1 & France==1 & restriction3==0 & restriction4<=100 & restriction5>3500, fe robust cluster(CompanyID)
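
One caveat with inequality restrictions such as restriction5>3500: Stata treats numeric missing values as larger than any number, so a "greater than" condition also selects observations where the restricting variable is missing. A small sketch using the auto data shipped with Stata (rep78 merely stands in for a restriction variable here) shows the difference:

```stata
sysuse auto, clear

* rep78 is missing for 5 cars; missing compares as larger than any number
count if rep78 > 3
scalar with_missing = r(N)

* Excluding missing values explicitly
count if rep78 > 3 & !missing(rep78)
scalar without_missing = r(N)

display "Extra observations picked up through missing values: " with_missing - without_missing
```

If the restricting variable can be missing, adding & !missing(varname) to the condition keeps those observations out of the subset.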

Best,

MM


Mari Meir

Join Date: Jul 2016
Posts: 61

08 Aug 2016, 01:28

Sorry, made a mistake on the last post. Please consider this.

Code:

 
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==0 & France==1, fe robust cluster(CompanyID)
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==1 & France==1, fe robust cluster(CompanyID)
xtreg BoardIndependence L.(LogTotalAsset SalesGrowthRate) if civil==1 & France==1 & restriction3==0 & restriction4<=100 & restriction5>3500, fe robust cluster(CompanyID)


Mari Meir

Join Date: Jul 2016

Posts: 61
#6

08 Aug 2016, 01:29

(The big spaces are supposed to be line breaks, one command per line. Anyway, you get the idea.)

Badar Khalid

Join Date: Oct 2015

Posts: 58
#7

08 Aug 2016, 05:12

Dear Mari Meir, thank you so much for providing suitable code for my query. That works fine.

Another query concerns excluding one country from the sample. For example, I have 15 countries in my sample and I would like to run the regression excluding one country, say the UK, for which I have a dummy variable (1 or 0).

Is there code for excluding the country, or do I need to list all the remaining 14 countries at the end of the command as we did above? Thank you in advance.

Nick Cox

Join Date: Mar 2014

Posts: 35696
#8

08 Aug 2016, 05:55

Badar: It's a better idea if you study the help for if as otherwise you will be coming back to the forum with every minor variation of this question.

That help gives these examples:

Code:

. sysuse auto
. list make mpg if mpg>25
. list make mpg if mpg>25 & mpg<30
. list make mpg if mpg>25 | mpg<10
. regress mpg weight displ if foreign==1

Already we've illustrated using if for inclusion with simple and compound conditions. The help points to the help for operators which tells about != for not equals. Hence

Code:

. list make mpg if _n != 42

would be a way to exclude observation 42.

In your case if the indicator variable were called UK then

Code:

... if !UK
... if UK == 0
... if UK != 1

would be equivalent.
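
A small caveat, sketched on a made-up five-observation dataset: the three forms agree when the indicator contains only 0 and 1, but they can differ if it has missing values, because a missing value is neither 0 nor 1 and counts as "true" in a logical test.

```stata
* Hypothetical indicator with one missing value
clear
input UK
0
0
1
1
.
end

count if !UK        // missing is nonzero, so !UK is false for it
scalar a = r(N)
count if UK == 0    // missing is not equal to 0
scalar b = r(N)
count if UK != 1    // missing is not equal to 1, so it is selected
scalar c = r(N)

display a " " b " " c
```

With a clean 0/1 indicator none of this matters, which is the situation described above.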

In short,

Code:

help if
help operators

Last edited by Nick Cox; 08 Aug 2016, 06:24.

Badar Khalid

Join Date: Oct 2015

Posts: 58
#9

08 Aug 2016, 06:21

Dear Nick, thank you for your advice. The code works fine for me; it is really appreciated.

Thanh Trinh

Join Date: Nov 2019

Posts: 6
#10

22 Nov 2019, 10:29

Dear Statalisters,

I am currently using Stata version 16.0. For my thesis, I am examining the impact of the legal system (common law versus civil law) on the absolute forecast error (EPA) and the forecast dispersion (DISP). To do so, I ran a pooled regression:

Code:

regress EPA LegalSyst LnSize Cover Loss Flev Roe
pwcorr EPA LegalSyst LnSize Cover Loss Flev Roe

I have 628 firms for 5 years and 16 countries and my models are as follows:

Model 1: EPA = β₀ + β₁*LegalSyst + β₂*LnSize + β₃*Cover + β₄*Loss + β₅*Flev + β₆*Roe
Model 2: DISP = β₀ + β₁*LegalSyst + β₂*LnSize + β₃*Cover + β₄*Loss + β₅*Flev + β₆*Roe

where LegalSyst and Loss are dummy variables.

I wanted to go further in my analysis and tried to set up panel data with country and year fixed effects. To determine whether I should use fixed or random effects, I ran a Hausman test and Pesaran's test with the following lines:

Code:

egen EnterpriseID = group(Enterprise)
xtset EnterpriseID Year
xtreg EPA LegalSyst LnSize Cover Loss Flev Roe, fe
estimates store Fixed

Code:

xtreg EPA LegalSyst LnSize Cover Loss Flev Roe, re
estimates store Random

Code:

hausman Fixed

According to the Hausman test, the fixed-effects model is appropriate since prob < 5%.

Code:

ssc install xtcsd
xtcsd, pesaran abs

According to Pesaran's test, there is cross-sectional dependence since prob < 5%.

I don't really know whether these commands and tests are relevant for my robustness test, but I feel that I should use fixed effects. I have seen the following alternatives for implementing fixed effects:

Code:

xtreg EPA LegalSyst LnSize Cover Loss Flev Roe i.CountryID##i.Year, fe
regress EPA LegalSyst LnSize Cover Loss Flev Roe i.CountryID
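
To make these alternatives concrete, here is a rough sketch on simulated data of how country, year, and country-year effects are commonly written with factor-variable notation. The dataset, the variable names (FirmID, CountryID, x, y) and the coefficients are all invented for illustration; this is not a recommendation for the thesis model itself. Keep in mind that a regressor that is constant within each country, as a legal-system indicator would be, is collinear with a full set of country indicators.

```stata
* Simulated panel: 60 hypothetical firms in 6 hypothetical countries, 5 years each
clear
set seed 2019
set obs 300
generate FirmID = ceil(_n/5)
bysort FirmID: generate Year = 2013 + _n
generate CountryID = ceil(FirmID/10)
generate x = rnormal()
generate y = 1 + 0.3*x + 0.1*Year + rnormal()
xtset FirmID Year

* (1) country effects and (2) year effects as separate sets of indicators
regress y x i.CountryID i.Year, vce(cluster FirmID)

* (3) one effect per country-year cell
regress y x i.CountryID#i.Year, vce(cluster FirmID)

* Firm fixed effects plus year effects: xtreg, fe and areg give the same slope
xtreg y x i.Year, fe
scalar b_xtreg = _b[x]
areg y x i.Year, absorb(FirmID)
scalar b_areg = _b[x]
display "Relative difference in slopes: " reldif(b_xtreg, b_areg)
```

The last two commands are only there to show that absorbing the firm effects with areg and using the within estimator in xtreg produce the same point estimate.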

My questions are as follows:

a) What is the appropriate code to implement fixed effects for (1) country, (2) year, and (3) country-year?
b) How should I proceed if I want to compare distinct countries or a combination of countries?

I would very much appreciate your help. Thank you!

Kind regards,

Thanh Trinh.