Dividing each column (variable) by all the others, and generating the quotient as new variables

Andrew Bernal

Join Date: Feb 2022

Posts: 31
#1

Dividing each column (variable) by all the others, and generating the quotient as new variables

08 Feb 2022, 11:24

Hello! I am working on this project, and I need to figure out how to loop this command correctly. I have daily exchange rate data for 80+ countries, all of which are with respect to USD (i.e. EUR/USD, JPY/USD, etc). I would like to create new variables which would allow me to get all exchange rate data for every bilateral pair in my dataset. Obviously, this would be done by dividing the EUR/USD by CNY/USD, which would result in the EUR/JPY exchange rate. Ideally, I would like to create the new variables with appropriate names, i.e. for the example above, the new variable name would be EUR_CNY. I am unsure if this is possible to do with a foreach loop. Sample data below:

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input str8 timeperiod float(ars eur aud cny) "1/3/2005" 2.9762344 .740357 1.282964 8.276967 "1/4/2005" 2.9779274 .748223 1.294426 8.276992 "1/5/2005" 2.9794314 .756201 1.312311 8.276996 "1/6/2005" 2.9735265 .758553 1.313358 8.277023 "1/7/2005" 2.977273 .757576 1.310682 8.27697 end

It would likely be necessary to get both EUR/CNY and CNY/EUR. Thanks in advance.
Tags: None
Clyde Schechter

Join Date: Apr 2014

Posts: 29792
#2

08 Feb 2022, 11:46

Code:

local currencies eur aud cny foreach c1 of local currencies { foreach c2 of local currencies { if `"`c1'"' != `"`c2'"' { gen `c1'_by_`c2' = `c1'/`c2' } } }

As an aside, your variable timeperiod, as a string variable, is going to be useless. You need to convert it to a Stata internal format daily date if you are going to use it in any kind of analysis. If you are not familiar with Stata datetime functions, at the very least read -help daily()- and -help datetime display formats-. And if you are going to be working with this kind of data in Stata on an ongoing basis, you should invest some time reading the entire Datetime chapter in the [D] volume of the PDF documentation that is installed with your Stata. It's a lot to learn, and you won't remember every detail, but your time will be amply and quickly repaid.

Thank you for using -datex- on your first post!!

Added: Inferring exchange rates in this way presumes that the exchange rates remain perfectly "synchronized." But I am told that there is an arbitrage market in currency exchange, so presumably that doesn't actually happen.

Last edited by Clyde Schechter; 08 Feb 2022, 11:48.
3 likes
Comment
Andrew Bernal

Join Date: Feb 2022

Posts: 31
#3

08 Feb 2022, 13:11

Thanks very much Clyde. In my actual data set, I have converted the dates to the correct time-series format, but thanks for that!

I have ~80 countries to consider. So, can my first line read to be something as simple as:

Code:

local currencies ds

And in this case, will the foreach loop still work? I'm a little more familiar using for loops in Python, so it seems strange to me that this would work without using "i" or "j" to count the variables.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29792
#4

08 Feb 2022, 13:36

Well, I can understand not wanting to type out a list of 80 countries to create local macro currencies. If there is some wildcard expression you can use to get those variable then you can do that with the -ds- command. For example, suppose the currencies are a consecutive block of variables in your data set starting with eur and ending with, say, jpy (Japanese Yen). Then you could do:

Code:

ds eur-jpy local currencies `r(varlist)'

If they are not a consecutive block, then it's harder. If they are several consecutive blocks, with a few interruptions by other variables, using the -order- command to turn them into a consecutive block might be the way to go. If they are really scattered around the data set and cannot be meaningfully abbreviated with wildcards, then I don't see an easy solution. Perhaps there is some data set you can find online that lists these and you could import the names from that instead.

Once you create the local macro currencies, the rest of my solution from that point on will do the job for you without any changes. Note that you will end up with a little under 1,700 new variables created. That data set may prove difficult to work with, but I suppose you'll cross that bridge when you come to it.

I'm a little more familiar using for loops in Python, so it seems strange to me that this would work without using "i" or "j" to count the variables.

Well, Stata is not Python. It is also not C, nor C++, nor any other language. You could solve this problem with counters used as pointers to the variables in the local macro, but it would be about twice as long (and hence twice as many opportunities to make a mistake) and would also run more slowly (though that might not be noticeable on a problem this size). But there is no reason to do that. The -foreach- command's ability to iterate over lists of variables, contents of local macros, lists of numbers, and arbitrary lists is one of Stata's best features. I encourage you to get familiar with them. You will never want to go back. (A big other chunk of my regular work involves programming in C++, and I routinely curse the inability to iterate over names of variables in this way. Of course, doing that flexibly is, to say the least, difficult to implement in a compiled language, so it is a reasonable price to pay in situations where execution speed is critical.) To be sure, there are some situations where you need to use counters in Stata, but they are relatively uncommon and when you encounter them, you appreciate, through its absence, the simplicity of -foreach-.

Last edited by Clyde Schechter; 08 Feb 2022, 13:39.
1 like
Comment
Andrew Bernal

Join Date: Feb 2022

Posts: 31
#5

09 Feb 2022, 09:21

Luckily, they are in a consecutive block! So this code just worked perfectly, thanks! Also, by my count, a little over 6000 new variables were created. Due to the nature of my project, I may not need all this information. In the above example, EUR/CNY and CNY/EUR are both created; however, I'll mainly be interested in daily percentage changes, so these values will end up being equivalent. I tried the below code to try to prevent this from happening, but I must be missing something. Any ideas?

Code:

ds ars-cny local currencies `r(varlist)' foreach c1 of local currencies { foreach c2 of local currencies { if `"`c1'"' != `"`c2'"' & 1/(`c2'/`c1') != `c1'/`c2'{ gen `c1'_by_`c2' = `c1'/`c2' } } }

If the bilateral pair already exists in some way, I don't need to generate a new variable. This, however, isn't quite working and I'm not sure why. Since ARS is the first in the list, I would think it would be able to generate ARS_EUR, ARS_AUD, ARS_CPY, and then EUR_AUD, EUR_CPY, and finally AUD/JPY.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 29792
#6

09 Feb 2022, 10:54

In #1 you said

It would likely be necessary to get both EUR/CNY and CNY/EUR.

I gather that is no longer the case.

Under this circumstance, looping over integers as pointers is actually the best approach. It goes like this:

Code:

ds ars-cny local countries `r(varlist)' local n_countries: word count `countries' forvalues i = 1/`n_countries' { local c1: word `i' of `countries' forvalues j = `=`i'+1'/`n_countries' { local c2: word `j' of `countries' gen `c1'_by_`c2' = `c1'/`c2' } }
Comment
Andrew Bernal

Join Date: Feb 2022

Posts: 31
#7

09 Feb 2022, 12:49

I believe it best that I have both approaches available, just in case. This works perfectly, thanks again!
Comment

Announcement

Dividing each column (variable) by all the others, and generating the quotient as new variables

Comment

Comment

Comment

Comment

Comment

Comment