Kolmogorov – Smirnov Test for panel database

Guido Guerra

Join Date: Feb 2018

Posts: 9
#1

29 Feb 2020, 04:49

Hello!

I want to apply the Kolmogorov–Smirnov (KS) test to a panel database, given that it is a frequently employed method in the literature. I have two simple questions:
1) My panel database covers 70% of the population of firms with more than 200 workers but less than 5% of the firms with fewer than 200 workers. Can I use this database?
2) I want to check whether exporting firms are more productive than non-exporting firms. However, firm j can be a non-exporter in year t, become an exporter in year t+1, and become a non-exporter again in year t+2. Can I compare productivity between exporters and non-exporters if firms switch status over time? If so, do I need any specific command for the KS test?
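
To make question 2 concrete: the two-sample KS statistic is the largest vertical gap between the empirical distribution functions of the two groups. Below is a minimal sketch in Python (not Stata syntax) that assigns each firm-year observation to the group matching its export status in that year and computes the statistic separately for each year, so that a firm that switches status never appears in both groups within one comparison. The data, the variable names and the helper functions (ecdf, ks_statistic, ks_by_year) are made up for illustration, and the sketch returns only the statistic, not a p-value. In Stata itself, the built-in -ksmirnov- command with its by() option runs the two-sample test (see -help ksmirnov-).

```python
from bisect import bisect_right
from collections import defaultdict


def ecdf(sample):
    """Return the empirical CDF: x -> share of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_statistic(a, b):
    """Two-sample KS statistic: largest absolute gap between the two ECDFs."""
    f_a, f_b = ecdf(a), ecdf(b)
    # Both ECDFs are step functions, so the largest gap occurs at an observed value.
    return max(abs(f_a(x) - f_b(x)) for x in set(a) | set(b))


def ks_by_year(panel):
    """Compute the KS statistic separately for each year of the panel.

    Assumes every year contains at least one exporter and one non-exporter.
    """
    groups = defaultdict(lambda: ([], []))  # year -> (non-exporters, exporters)
    for _firm, year, exports, productivity in panel:
        groups[year][exports].append(productivity)
    return {year: ks_statistic(non, exp) for year, (non, exp) in sorted(groups.items())}


# Made-up data: (firm, year, exports, productivity); firms A and D switch status.
panel = [
    ("A", 2000, 0, 1.0), ("B", 2000, 1, 1.8), ("C", 2000, 0, 1.2), ("D", 2000, 1, 2.1),
    ("A", 2001, 1, 1.5), ("B", 2001, 1, 1.9), ("C", 2001, 0, 1.1), ("D", 2001, 0, 1.7),
]
d_by_year = ks_by_year(panel)
```

Pooling all years into a single comparison would instead treat repeated observations of the same firm as independent, which the standard KS test assumes but a panel does not guarantee.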

Thank you very much for your time.

Last edited by Guido Guerra; 29 Feb 2020, 04:52.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17673
#2

29 Feb 2020, 07:16

Guido:
Admittedly, I've never come across the KS test in panel data regression, hence I cannot say. Are you interested in testing whether exporters and non-exporters follow the same distribution?
1) In my opinion you can use that database if it gives a true and fair representation of the population from which your sample was drawn.
2) Bearing in mind that my last experience with international trade dates back 30 years (hence, I do not know the literature in that research field), from your description it sounds like you're interested in investigating whether exporting or not (which is not a stable feature of your sample of companies) contributes, when adjusted for the other predictors, to explaining variation in company productivity. If that is the case, I would simply add a categorical predictor (say, -i.export-; see -help fvvarlist-) that takes the value 0 if the company does not export in a given wave and 1 if it does.

Kind regards,
Carlo
(Stata 19.0)
Guido Guerra

Join Date: Feb 2018

Posts: 9
#3

06 Mar 2020, 05:25

Hello Carlo,

Thank you very much for your clear answer. Indeed, as you propose, I previously ran a regression with a dummy variable for the export status of the firm and some control variables. However, I would like to perform an additional test. Following Delgado, Fariñas and Ruano (2002), I am using the KS test. I am also trying to reproduce a very interesting and eye-catching graph from that paper, which compares the cumulative distribution functions of productivity for exporters and non-exporters. They describe it as follows:

"To further illustrate the comparisons between different groups of firms we have graphed estimates of the distribution functions. In particular, we have computed the smooth, or perturbed, sample distribution function, rather than the sample distribution function itself, which provides nice smooth distribution estimates. The smooth sample distribution estimator was proposed by Nadaraya (1964). Since the purpose here is to produce graphical representations of the differences between two groups of firms, we represent these distributions for the whole population of firms. Consider for that purpose the distribution F_t(.), which corresponds to the productivity of, say, exporting firms, and F_t(. | r = r0), r0 = {0,1}, which denotes the conditional distribution function for a given size group of firms, small (r = 0) or large (r = 1). The selective sampling scheme used in our data set implies that only these conditional cumulative distribution functions, F_t(. | r = r0), can be estimated directly. However, the cumulative distribution function for the whole population of exporters can be obtained by the following expression:
F_t(.) = P_t(r = 0) × F_t(. | r = 0) + P_t(r = 1) × F_t(. | r = 1)
where P_t(.) represents the probability of being either a small or a large firm in the considered group of exporting firms. This expression indicates that the cumulative distribution function for the whole population of firms can be estimated as a weighted average of the two conditional cumulative distribution functions. Marginal probabilities can be calculated from the information provided by the sample. The estimation of marginal probabilities for the population of firms takes into account the sampling proportions of the data set. As indicated, the sampling proportion is 0.05 for small firms and 0.7 for large firms. Therefore, for any group of firms, say exporters, the number of large and small firms can be estimated multiplying the number of firms in the sample by the inverse of the sampling proportion. This procedure permits the calculation of relative frequencies and therefore the estimation of marginal probabilities of being either a small or a large firm. In particular, for the group of non-exporting firms, the estimated probability of being small is P_t(r = 0) = 0.993 and the probability of being large is P_t(r = 1) = 0.007. For the group of exporting firms, these probabilities are P_t(r = 0) = 0.924 and P_t(r = 1) = 0.076."
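
As I read the passage, the arithmetic has two steps: scale the sample counts by the inverse sampling proportions (0.05 for small firms, 0.7 for large firms) to estimate the share of small and large firms in the population, then take the weighted average of the two conditional distribution functions. A minimal sketch in Python (not Stata syntax) follows; the productivity values and the function names are made up for illustration, and it uses the plain empirical distribution function rather than the smoothed estimator the authors mention.

```python
from bisect import bisect_right

# Sampling proportions stated in the quoted passage: r = 0 (small), r = 1 (large).
SAMPLING_RATE = {0: 0.05, 1: 0.70}


def population_shares(sample_counts):
    """Estimate P(r = 0) and P(r = 1) from sample counts.

    Each count is multiplied by the inverse of its sampling proportion,
    and the scaled counts are converted to relative frequencies.
    """
    scaled = {r: n / SAMPLING_RATE[r] for r, n in sample_counts.items()}
    total = sum(scaled.values())
    return {r: value / total for r, value in scaled.items()}


def ecdf(sample):
    """Return the empirical CDF: x -> share of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def population_cdf(samples_by_size):
    """Weighted average of the conditional CDFs, F(. | r = 0) and F(. | r = 1)."""
    shares = population_shares({r: len(s) for r, s in samples_by_size.items()})
    cdfs = {r: ecdf(s) for r, s in samples_by_size.items()}
    return lambda x: sum(shares[r] * cdfs[r](x) for r in shares)


# Made-up productivity values for exporters in one year, split by size group.
exporters_by_size = {
    0: [0.9, 1.1, 1.6],                       # small firms in the sample
    1: [1.2, 1.5, 1.8, 2.0, 2.3, 2.6, 3.0],   # large firms in the sample
}
shares = population_shares({r: len(s) for r, s in exporters_by_size.items()})
F_exporters = population_cdf(exporters_by_size)
```

With these made-up samples, 3 small firms scale to 60 and 7 large firms scale to 10, so the estimated shares are 6/7 and 1/7, and F_exporters(x) weights the two conditional distribution functions accordingly.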

I am able to produce a graph of the cumulative productivity distribution functions. However, even after reading this passage, the PDF documentation on smoothed sample distributions, and the related threads in the forum, it is still unclear to me what command or code I need in order to reproduce the graphical representation proposed by Delgado et al. (2002).

Does anyone have any suggestion or idea?
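
For what it is worth, my understanding is that the smooth estimator of Nadaraya (1964) replaces each step of the empirical distribution function with a kernel distribution function, so the estimate at x is the average of K((x - X_i) / h) over the sample. The sketch below, in Python rather than Stata, assumes a Gaussian kernel and an arbitrary bandwidth h; the quoted passage specifies neither, so both are my assumptions, and the function names and data are made up.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, used here as the kernel CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smooth_cdf(sample, bandwidth):
    """Kernel-smoothed distribution estimate: mean of K((x - X_i) / h)."""
    n = len(sample)

    def estimate(x):
        return sum(normal_cdf((x - xi) / bandwidth) for xi in sample) / n

    return estimate


# Made-up productivity values; the bandwidth is an arbitrary illustrative choice.
productivity = [1.0, 2.0, 3.0]
F_smooth = smooth_cdf(productivity, bandwidth=0.5)

# Evaluate on a grid of 41 points from 0.0 to 4.0 to obtain the points of a smooth curve.
grid = [0.1 * i for i in range(41)]
curve = [(x, F_smooth(x)) for x in grid]
```

The same smoothing could be applied to each size group's conditional distribution before taking the weighted average described in the quoted passage, and the resulting curves for exporters and non-exporters could then be drawn on one graph.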