Hi,
I have a conceptual question regarding panel data analysis.
Consider the following panel data, where Firm_ID identifies a firm and CEO_ID identifies the CEO in place during the fiscal year (CC_FY). Each firm in my sample underwent exactly one CEO change. NL represents a psychological construct of the CEO, and NL_Median is the median of this construct across the fiscal years (CC_FY) of a given CEO at a given firm. Based on NL_Median, I want to understand how NL is correlated within a firm across its two CEOs.
Please ignore that some observations refer to the same CC_FY, producing duplicates.
My initial idea was to create a variable CNS_Median_First, which holds the NL_Median values for the first CEO, and a variable CNS_Median_Second, which holds the NL_Median values for the second CEO of each firm. Then, I could calculate the correlation between these two variables by Firm_ID, i.e., within each specific firm. This would give me a set of correlation coefficients.
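To make the idea concrete, here is a minimal sketch of how the two variables could be built in Stata. It assumes the example data with the variables described above is in memory, and that the CEO with the earlier first fiscal year is the "first" CEO. The helper names first_year and ceo_order are made up for illustration. Note that the sketch ends with a single correlation computed across firms rather than one coefficient per firm, because NL_Median is constant within a firm-CEO pair and each firm therefore contributes only one (first, second) pair; that is my assumption about how the comparison would have to be set up, not an established method.

```stata
* Sketch only: assumes Firm_ID, CEO_ID, CC_FY and NL_Median are in memory.
* first_year and ceo_order are illustrative helper names.

* Earliest fiscal year observed for each CEO within a firm
bysort Firm_ID CEO_ID: egen first_year = min(CC_FY)

* Keep one row per firm-CEO pair (NL_Median is constant within a pair)
bysort Firm_ID CEO_ID: keep if _n == 1

* Order the CEOs within each firm: 1 = first CEO, 2 = second CEO
bysort Firm_ID (first_year): gen byte ceo_order = _n

* Copy each CEO's NL_Median onto all rows of the firm
bysort Firm_ID: egen CNS_Median_First  = max(cond(ceo_order == 1, NL_Median, .))
bysort Firm_ID: egen CNS_Median_Second = max(cond(ceo_order == 2, NL_Median, .))

* Reduce to one row per firm
bysort Firm_ID: keep if _n == 1
keep Firm_ID CNS_Median_First CNS_Median_Second

* Correlation of the two medians across firms
correlate CNS_Median_First CNS_Median_Second
```

With only the two firms in the example data, the resulting coefficient is uninformative; the sketch is only meant to show the data layout.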
However, I'm unsure how to proceed from here. I feel that simply taking the average of all these correlation coefficients might not be sufficient.
Any advice on how to properly analyze this correlation would be greatly appreciated.
Thank you!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double Firm_ID int CC_FY long CEO_ID float(NL NL_Median)
4295899290 2010 18101 -1.0617353 -1.0617353
4295899290 2010 18101 -1.0617353 -1.0617353
4295899290 2015 35166   .7557002  1.1048557
4295899290 2013 35166  1.4971353  1.1048557
4295899290 2013 35166  1.4971353  1.1048557
4295899290 2013 35166  1.4971353  1.1048557
4295899290 2013 35166  1.4971353  1.1048557
4295899290 2012 35166  -1.130444  1.1048557
4295899290 2014 35166  1.1048557  1.1048557
4295899290 2012 35166  -1.130444  1.1048557
4295899290 2014 35166  1.1048557  1.1048557
4295899290 2014 35166  1.1048557  1.1048557
4295899290 2014 35166  1.1048557  1.1048557
4295899290 2012 35166  -1.130444  1.1048557
4295899323 2011 30552    -.66134    -.66134
4295899323 2011 30552    -.66134    -.66134
4295899323 2011 30552    -.66134    -.66134
4295899323 2017 31112 -2.9109335 -2.9109335
4295899323 2015 31112  -2.444447 -2.9109335
4295899323 2014 31112  -2.857669 -2.9109335
4295899323 2017 31112 -2.9109335 -2.9109335
4295899323 2016 31112   -3.69725 -2.9109335
4295899323 2014 31112  -2.857669 -2.9109335
4295899323 2016 31112   -3.69725 -2.9109335
4295899323 2016 31112   -3.69725 -2.9109335
4295899323 2017 31112 -2.9109335 -2.9109335
4295899323 2018 31112 -2.3553092 -2.9109335
end