Hi all,
For my Master's thesis, I have decided to be ambitious and create an index. It is a bit complex, so I will explain the background of my problem.
I am studying how employee characteristics affect firm performance. For that, I currently have 2 files: 1 with firm-level data (variables such as assets, debt...) structured as panel data. Another file with employee-level data (it contains variables measuring what firm the employee works for, their gender, what education they had, how many years of expertise, and how many simultaneous jobs they have).
My goal is to merge these 2 datasets and end up with firm-level data, with the help of an index. This is where my problem comes.
My employee-level dataset is currently structured as follows:
Where gender is a dummy (0 = male, 1=female). For my index I would like to create a variable called Index. To create it, I would need to transform the data from employee-level, to firm-level (i.e. have 1 observation per firm and year, instead of per employee and company). To do this, I would need to take into account that there are many employees working for one company (CompanyID).
My Index variable should be an aggregate of the different characteristics of all employees working for it. I would like to create a rule so that:
I would end up with something like this (I am making these values up):
Does anyone have an idea of how I can do this?
Thank you very much in advance!
Carla
For my Master's thesis, I have decided to be ambitious and create an index. It is a bit complex, so I will explain the background of my problem.
I am studying how employee characteristics affect firm performance. For that, I currently have 2 files: 1 with firm-level data (variables such as assets, debt...) structured as panel data. Another file with employee-level data (it contains variables measuring what firm the employee works for, their gender, what education they had, how many years of expertise, and how many simultaneous jobs they have).
My goal is to merge these 2 datasets and end up with firm-level data, with the help of an index. This is where my problem comes.
My employee-level dataset is currently structured as follows:
EmployeeID | CompanyID | Year | Gender | Simult_Jobs | Years_exper |
1 | 1 | 2000 | 1 | 2 | 5 |
1 | 2 | 2000 | 1 | 2 | 5 |
1 | 1 | 2001 | 1 | 1 | 2 |
2 | 3 | 2003 | 0 | 1 | 10 |
2 | 3 | 2004 | 0 | 1 | 20 |
3 | 1 | 2000 | 1 | 3 | 3 |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
Where gender is a dummy (0 = male, 1=female). For my index I would like to create a variable called Index. To create it, I would need to transform the data from employee-level, to firm-level (i.e. have 1 observation per firm and year, instead of per employee and company). To do this, I would need to take into account that there are many employees working for one company (CompanyID).
My Index variable should be an aggregate of the different characteristics of all employees working for it. I would like to create a rule so that:
- The value of the Index variable increases in 1 unit if the employee is female, 0 if male.
- The value of the Index variable increases in 2 units if the employee works in 3 or less jobs simultaneously, 1 if 4 or 5, and 0 if more than 5.
- The value of the Index variable increases in 2 units when the employee has 10 or more years of experience, 1 if between 5 and 9, and 0 if less than 5.
I would end up with something like this (I am making these values up):
CompanyID | Year | Index |
1 | 2000 | 5 |
1 | 2001 | 3 |
1 | 2002 | 4 |
2 | 2000 | 3 |
2 | 2001 | 2 |
2 | 2002 | 5 |
Does anyone have an idea of how I can do this?
Thank you very much in advance!
Carla
Comment