Hello,
I am facing a problem in my current research project and would greatly appreciate some guidance on how to proceed.
My project examines the relationship between CEO narcissism, a psychological trait, and linguistic features of CEOs' speech during quarterly conference calls. These calls serve as a platform for discussing the firm's results for the preceding quarter. One of my primary hypotheses posits a positive relationship between CEO narcissism and the tone of the CEO's speech. Specifically, I propose that more narcissistic CEOs use positive words more frequently (a more positive tone) than their less narcissistic counterparts.
To quantify this, I extract CEOs' speech from conference call transcripts, categorize words as positive or negative using a specialized dictionary, and then calculate the tone measure. The dependent variable is expressed as: (number of positive words - number of negative words) divided by the total number of words.
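To make the formula concrete, the tone calculation can be sketched in Python roughly as follows. This is only an illustration: the word sets `POSITIVE` and `NEGATIVE`, the function name `tone`, and the simple regex tokenizer are placeholders I am assuming here, not the actual dictionary or preprocessing.

```python
import re

# Placeholder word lists for illustration only; a real dictionary is far larger.
POSITIVE = {"strong", "growth", "improved"}
NEGATIVE = {"decline", "loss", "weak"}

def tone(text: str) -> float:
    """(positive words - negative words) / total words for one speech."""
    # Assumed tokenization: lowercase runs of letters and apostrophes
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0  # avoid dividing by zero on an empty transcript
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

example = "We saw strong growth this quarter despite a weak start"
print(tone(example))
```

The measure is bounded between -1 and 1, and in practice it is close to zero because most words are neither positive nor negative, which is why the coefficients in the regression are small in absolute terms.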
Measuring psychological traits can be challenging, and I rely on a measurement based on archival data specifically developed to capture narcissism in the business context. This measurement comprises 15 items (e.g., compensation of the CEO, no. of awards a CEO received, use of corporate private jet...) which I combine using Principal Component Analysis (PCA).
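For reference, combining the items into one score via PCA can be sketched as below. This is a minimal sketch under assumptions: it standardizes the items and keeps only the first principal component, the function name `first_component_scores` is hypothetical, and the data are simulated placeholders rather than real CEO data.

```python
import numpy as np

def first_component_scores(items: np.ndarray) -> tuple[np.ndarray, float]:
    """Return first-principal-component scores and the variance share it explains."""
    # Standardize each item so differently scaled items contribute comparably
    z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of vt are the component loadings
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[0]
    explained = float(s[0] ** 2 / np.sum(s ** 2))
    return scores, explained

rng = np.random.default_rng(0)
items = rng.normal(size=(200, 15))  # simulated stand-in for 15 narcissism items
scores, explained = first_component_scores(items)
print(scores.shape, explained)
```

Note that the sign of a principal component is arbitrary, so the direction of the resulting score has to be checked against the item loadings before interpreting the sign of a regression coefficient.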
My sample comprises firms listed on the S&P 500 index from 2010 to 2018. I extracted quarterly conference call transcripts for these firms, calculated the tone measure for each call, and matched the data with CEO narcissism data.
I control for factors that have been found to affect tone during conference calls. Because my data have a panel structure (unbalanced, as some firm-quarters are missing), I include firm and quarter fixed effects, in line with other studies in the field.
I have checked thoroughly for data collection errors, varied the PCA parameters, and tried alternative approaches to constructing the narcissism proxy, but my results remain counterintuitive. I anticipated a positive association between CEO narcissism and positive language, as seen in other firm disclosures, but I consistently observe either a non-significant coefficient or, unexpectedly, a negative one.
The model I am estimating is as follows:
Code:
xtset CompanyID Quarter
xtreg CEO_Tone Controls CEO_Narcissism i.Quarter, fe robust

Fixed-effects (within) regression               Number of obs     =      6,163
Group variable: Data~OPermID                    Number of groups  =        262

R-squared:                                      Obs per group:
     Within  = 0.1039                                         min =          1
     Between = 0.0356                                         avg =       23.5
     Overall = 0.0543                                         max =         36

                                                F(52, 261)        =       9.88
corr(u_i, Xb) = -0.1417                         Prob > F          =     0.0000

                            (Std. err. adjusted for 262 clusters in CompanyID)
---------------------------------------------------------------------------------------
                      |               Robust
             CEO_Tone | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
----------------------+----------------------------------------------------------------
            Con_ROA_2 |    .008129   .0099925     0.81   0.417    -.0115472    .0278053
           Con_Size_2 |   .0004699   .0009772     0.48   0.631    -.0014543     .002394
              Con_Lev |   .0000651   .0020984     0.03   0.975    -.0040669    .0041971
              Con_BTM |  -.0032567   .0012968    -2.51   0.013    -.0058102   -.0007032
           Con_Loss_1 |  -.0012178   .0006943    -1.75   0.081     -.002585    .0001495
     Con_Sales_Growth |   .0035248   .0011005     3.20   0.002     .0013578    .0056917
  Con_Ret_Quart_Compu |   .0069279   .0011789     5.88   0.000     .0046066    .0092492
Con_No_Analyst_Follow |  -.0000911   .0000731    -1.25   0.214     -.000235    .0000529
      Con_Forecast_SD |  -.0021832   .0040461    -0.54   0.590    -.0101502    .0057839
        Con_Earn_Surp |   .0020445   .0014209     1.44   0.151    -.0007533    .0048424
    Con_Forecast_miss |  -.0021387   .0003991    -5.36   0.000    -.0029247   -.0013528
       Con_No_Geo_Seg |   .0019137   .0017434     1.10   0.273    -.0015193    .0053467
       Con_No_Bus_Seg |  -.0001153    .001509    -0.08   0.939    -.0030866     .002856
    Compustat_CEO_Age |   .0001076   .0001333     0.81   0.420    -.0001549    .0003701
           CEO_Tenure |  -.0002694    .000122    -2.21   0.028    -.0005096   -.0000292
           CEO_Gender |   .0072985   .0033786     2.16   0.032     .0006459    .0139512
       CEO_Narcissism |   -.000374    .000171    -2.19   0.030    -.0007107   -.0000373
Another paper with a similar data structure used pooled OLS; when I estimate that specification, I obtain similar results.
While I understand that a detailed analysis of the underlying data is necessary for a precise assessment, I would greatly appreciate any advice or suggestions you may have on how I should proceed.
Thank you for your time and assistance.