Identifying IDs for panel data

Guest
#1

19 Jan 2022, 07:41

Dear all,

I'm currently working on bilateral trade data for Germany. I have export and import values for Germany with 25 other countries, broken down by sector, and there are almost 900 sectors. I want to regress bilateral German exports on the exchange rate and the GDP of the given country, so I need a unique id for each country-sector pair. For example, I have data on the apparel sector in 2001 for Austria, data on the same sector in the same year for Turkey, and so on.

To deal with this, I encoded countries as numbers, so that AUS is 1 and USA is 25. Sectors are described by numbers, some of them with 6 digits. I therefore multiplied the country number by 1000000 and added the sector code. However, it turns out that Stata rounds big numbers. E.g., I have sector 610339 in 2001 for Turkey, and Stata calculated its id as 24610340. I have sector 610341 in 2001 for TUR, and Stata calculated 24610340 again. Please help me.
Øyvind Snilsberg

Join Date: Oct 2021

Posts: 591
#2

19 Jan 2022, 08:14

Code:

egen id = group(country sector)

https://www.stata.com/support/faqs/d...p-identifiers/
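A minimal sketch of how the grouped id might then be used to declare the panel. It assumes the variables are named country, sector, and year and that there is one observation per country-sector-year; those names and that layout are assumptions, since post #1 does not show the dataset.

```stata
* Sketch only: country, sector, and year are assumed variable names.
* Number each distinct country-sector pair 1, 2, 3, ...
egen long id = group(country sector)

* Stops with an error if id and year do not uniquely identify observations
isid id year

* Declare the panel structure: id is the panel variable, year the time variable
xtset id year
```

Because group() numbers the pairs consecutively, roughly 25 x 900 pairs stay far below the size at which integer precision becomes a concern.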
Guest
#3

19 Jan 2022, 08:35

Thank you! It works. After a few hours of browsing the Internet and trying different approaches, I am finally one small step forward.
William Lisowski

Join Date: Dec 2014
Posts: 10150
#4

19 Jan 2022, 11:44

Backing up to post #1, the problem with the approach you took is one of precision, not of Stata rounding.

For a quick explanation, see the FAQ at

http://www.stata.com/support/faqs/da...-point-values/

For more, see the output from help precision.

Here is an example that uses what you learn from those sources to make your approach work.

Code:

. set obs 1
Number of observations (_N) was 0, now 1.

. generate country = 24

. generate sector = 610339

. generate float id_f = country*1000000 + sector

. generate long id_l = country*1000000 + sector

. generate double id_d = country*1000000 + sector

. format %10.0f id*

. list, noobs

  +---------------------------------------------------+
  | country   sector       id_f       id_l       id_d |
  |---------------------------------------------------|
  |      24   610339   24610340   24610339   24610339 |
  +---------------------------------------------------+

Here are the limits on storage of decimal integers with full accuracy in the various numeric storage types. The fixed-point variables lose the 27 largest positive values to missing value codes; the similar loss for floating point variables occurs only for the largest exponent, so it doesn't affect the much smaller integer values.

type     bits   minimum                      maximum
byte     7      -127                         100
int      15     -32,767                      32,740
long     31     -2,147,483,647               2,147,483,620
float    24     -16,777,216                  16,777,216
double   53     -9,007,199,254,740,992       9,007,199,254,740,992
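As a quick check of the float and double limits, something like the following could be run interactively. It is only a sketch: it uses the float() function to round a value to float precision, and the comments give the results expected from the usual round-to-nearest behavior of single precision.

```stata
* 2^24 = 16,777,216 is the last point up to which every integer fits in a float
display %12.0f float(16777216)      // 16777216, stored exactly
display %12.0f float(16777217)      // 16777216, no longer representable

* The ids from post #1 lie above 2^24, where floats can hold only even integers
display %12.0f float(24610339)      // 24610340
display %12.0f float(24610341)      // 24610340
display float(24610339) == float(24610341)   // 1, the two ids collide

* The corresponding limit for double is 2^53
display %20.0f 2^53                 // 9007199254740992
```

This is why id_f in the listing above shows 24610340, while the long and double versions keep 24610339.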
