Balancing panel dataset for country pairs by expanding for missing data

Rabih El Habta

Join Date: Jan 2019

Posts: 36
#1

Balancing panel dataset for country pairs by expanding for missing data

18 Apr 2019, 13:21

Hi all,

I have an unbalanced panel dataset consisting of 217 exporting countries, 172 importing countries, and for 30 years.
I need to expand it by filling missing years, exporters, and importer, to get a balanced panel with a total of 30*217^2 observations.
I tried the command as in
https://www.statalist.org/forums/for...783#post843783
but STATA replies "NOT SORTED"
Kindly find below an example of my dataset where there is 3 exporters, 4 importers, and 4 years.
Basically, I want first to get 4^3= 64 observations (48 observations between country pairs and 16 between themselves), and second to fill exporter's and importer's names for the corresponding missing data.
any help is highly appreciated, thank you so much.

* Example generated by -dataex-. To install: ssc install dataex
clear
input str30(exporter importer) long year double simple float pair_id
"Afghanistan" "Australia" 2014 2.55 1
"Afghanistan" "Australia" 2015 1.68 1
"Afghanistan" "Australia" 2016 3.39 1
"Afghanistan" "Australia" 2017 2.72 1
"Afghanistan" "Chile" 2015 6 2
"Afghanistan" "Chile" 2016 6 2
"Afghanistan" "Chile" 2017 6 2
"Afghanistan" "United States" 2014 3.58 3
"Afghanistan" "United States" 2015 3.45 3
"Afghanistan" "United States" 2016 3.46 3
"Afghanistan" "United States" 2017 2.75 3
"Australia" "Australia" 2014 3.23 4
"Australia" "Australia" 2015 2.94 4
"Australia" "Australia" 2016 3.04 4
"Australia" "Australia" 2017 2.97 4
"Australia" "Chile" 2015 6 5
"Australia" "Chile" 2016 6 5
"Australia" "Chile" 2017 6 5
"Australia" "United States" 2014 3.52 6
"Australia" "United States" 2015 3.45 6
"Australia" "United States" 2016 3.46 6
"Australia" "United States" 2017 3.24 6
"Chile" "Australia" 2014 3.07 7
"Chile" "Australia" 2015 2.81 7
"Chile" "Australia" 2016 2.92 7
"Chile" "Australia" 2017 2.93 7
"Chile" "Chile" 2015 6 8
"Chile" "Chile" 2016 6 8
"Chile" "United States" 2014 3.28 9
"Chile" "United States" 2015 3.16 9
"Chile" "United States" 2016 3.18 9
"Chile" "United States" 2017 6.16 9
end
[/CODE]
------------------ copy up to and including the previous line ------------------

Listed 32 out of 32 observations

400 Bad Request

https://www.statalist.org
Tags: None
David Beheshti

Join Date: Mar 2018

Posts: 35
#2

18 Apr 2019, 19:54

The below code snippet gives a full panel with pair_id as the panel variable:

Code:

xtset pair_id year tsfill, full bysort pair_id: egen temp = mode(exporter) replace exporter = temp if missing(exporter) drop temp bysort pair_id: egen temp = mode(importer) replace importer = temp if missing(importer) drop temp

However, this only gives 36 observations as opposed to the 64 you want. Could you explain what other observations you want to be in the dataset that aren't there now?
1 like
Comment
Rabih El Habta

Join Date: Jan 2019

Posts: 36
#3

19 Apr 2019, 00:57

Thank you, David

In the data sample I provided, I have three exporters e.g. ( Afghanistan, Australia, and Chile) and four importers e.g. (same countries and the USA).
I need each country pair to have 4 observations as follow:
AFG AFG from 2014 to 2017
AFG USA from 2014 to 2017
AFG AUS from 2014 to 2017
AFG Chile from 2014 to 2017
thus for every importer, I need to have 16 observations.
in the case of Afganistan, 4 observations are missing and I need Stata to generate these missing observations e.g. AFG AFG from 2014 to 2017
while for Australia and Chile 4 observations each, e.g. Chile AFG and AUS AFG from 2014 to 2017
whereas for the USA 16 observations are missing, so I need to generate these missing observations as explained above.
At the end, I will have 4 exporters, 4 importers and 4 years, which gives 64 observations.
These are the data that are not there.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17672
#4

19 Apr 2019, 01:58

Rabin:
In general, your approach is not recommended.
Stata can handle both unbalanced and balanced panel with no problem.
More substantively, the way you fill missing values affect your subsequent inference; if the missing mechanism is not ignorable and you skip modeling it, the results you get are far from being reliable.

Last edited by Carlo Lazzaro; 19 Apr 2019, 02:26.

Kind regards,
Carlo
(StataNow 19.0)
1 like
Comment
Rabih El Habta

Join Date: Jan 2019

Posts: 36
#5

19 Apr 2019, 02:29

Thanks, Carlo,

However, after expanding my dataset with the missing observations I will fill them with data from available resources, but I need to get this first done.
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17672
#6

19 Apr 2019, 02:40

Rabin:
the main issue is if the dataset thet you get after filling in the missing values according to your approach, resembles (or not) your original sample.

Kind regards,
Carlo
(StataNow 19.0)
1 like
Comment

Announcement

Balancing panel dataset for country pairs by expanding for missing data

Comment

Comment

Comment

Comment

Comment