I have a dataset with variables cusip, cusip_, and sic. cusip and cusip_ have the same values, but are arranged so they can be put into a correlation matrix where each pair (cusip, cusip_) shows up once. Variable sic depends on the value in cusip_. A snapshot would look like:
I want to generate a new variable sic (included above) which corresponds to cusip, not cusip_. I want to somehow label each of the 609 values of cusip_ by the corresponding sic_, and map the sic_ to the same values in cusip. And the match will not be perfect, as there are 810 values in cusip, some of which are not in cusip_, and in those cases I'd like the entry to be a missing observation. The data makes a 810 x 609 matrix of cusip x cusip_.
Do you have any suggestions? Thank you in advance for the help! And if my question seems basic, I also apologize, as I'm relatively new to Stata and have not yet found a past question that addresses this.
cusip | cusip_ | sic_ | sic |
2 | 1 | 10 | 20 |
1 | 1 | 10 | 10 |
3 | 1 | 10 | 30 |
... | ... | ||
1 | 2 | 20 | 10 |
2 | 2 | 20 | 20 |
3 | 2 | 20 | 30 |
Do you have any suggestions? Thank you in advance for the help! And if my question seems basic, I also apologize, as I'm relatively new to Stata and have not yet found a past question that addresses this.