Dear all, I would like to ask for your input on the following task. I have a monthly panel dataset of stock (identifier PERMNO) with variable date as a time variable. For each PERMNO and date, I have the following 2 variables: 1) market beta and 2) PVS beta (where PVS is a proxy for market sentiment). My goal is the following: each month, I have to calculate the cross-sectional Pearson product-moment correlation between market beta and PVS beta which I will call ρβ_βΔPVS. After calculating the cross-sectional correlation, I have to divide months in the sample into those with high and low cross-sectional correlation between market beta and PVS beta. High (low) ρβ_βΔPVS months are taken to be those months with values of ρβ_βΔPVS greater than or equal to (less than) the median ρβ_βΔPVS.
I have used the following code, but I am super unsure whether this is the correct way:
Code:
// stocks beta by PERMNO date
use "Beta Monthly new"
format ym %tm
// merge the beta PVS
merge 1:1 PERMNO ym using "bPVS FINAL"
drop if _merge == 1| _merge == 2
// clean dataset
collapse beta betaPVS, by(dateff PERMNO)
drop if beta == . | betaPVS == .
// panel dataset
egen firm = group(PERMNO)
egen time = group(date)
// Sort the data by date
sort date
// Create a variable to store the correlation results
gen ρβ_βΔPVS = .
// Generate a unique identifier for each month
egen month = group(date)
// Get the list of unique month identifiers
levelsof month, local(months)
// Loop over each month to calculate the correlation
foreach m of local months {
* Restrict the data to the current month
preserve
keep if month == `m'
* Check if there are enough observations to calculate correlation
quietly count
if r(N) > 1 {
* Debug: Print the current month and number of observations
di "Processing month: `m' with observations: " r(N)
* Calculate the correlation
corr beta betaPVS, means
* Debug: Print the correlation matrix
matrix list r(C)
* Extract the correlation coefficient
matrix C = r(C)
local corr = C[1,2]
* Restore the full dataset
restore
* Store the result in the dataset
replace ρβ_βΔPVS = `corr' if month == `m'
}
else {
* Debug: Print message if not enough observations
di "Not enough observations for month: `m'"
* Restore the full dataset if not enough observations
restore
}
}
// Check the results
list date ρβ_βΔPVS if !missing(ρβ_βΔPVS)
// Summarize the correlations
summarize ρβ_βΔPVS
// Calculate the median of the corr_beta_ivol
sum ρβ_βΔPVS, detail
// Split the sample based on median correlation
egen median_ρβ_βΔPVS = median(ρβ_βΔPVS)
gen high_low_corr = ρβ_βΔPVS >= median_ρβ_βΔPVS
// clean dataset
collapse high_low_corr, by (dateff PERMNO)
In case it is needed, this is a small portion of the dataset I have
PERMNO date beta betaPVS
10006 31aug1971 .79016378 -.00768161
10014 31aug1971 1.9341393 -.02881953
10057 31aug1971 .97684805 .00235655
10102 31aug1971 1.4654419 .00578056
10137 31aug1971 .80199136 -.00021375
10145 31aug1971 1.2391442 .0222873
10153 31aug1971 1.7417629 -.00638195
10161 31aug1971 .8160666 .01480798
10188 31aug1971 2.2849678 .0101683
10225 31aug1971 .79444095 -.00114427
10233 31aug1971 1.4605849 -.00501463
10241 31aug1971 .74535671 -.00248796
10276 31aug1971 .26219438 -.00930459
10321 31aug1971 1.3446186 -.01120439
10364 31aug1971 1.0343247 -.0062352
10372 31aug1971 .98055205 .01119765
10401 31aug1971 .7315204 .00118681
10460 31aug1971 1.7546463 .01137953
10479 31aug1971 1.1086954 -.01760248
10487 31aug1971 .50537227 -.01441397
10495 31aug1971 1.1820425 -.00525025
10516 31aug1971 1.1805342 -.00638053
10559 31aug1971 .76682362 .00182585
10604 31aug1971 1.3611181 -.00065288
10698 31aug1971 1.0273563 -.00911494
10727 31aug1971 .32257196 -.00793185
10743 31aug1971 1.0391832 .01016556
10751 31aug1971 1.6242846 -.01652368
Thank to everyone in advance for you answer,
Please let me know if something else is needed from my side
Best regards,
Giovanni Dell'Erba
I have used the following code, but I am super unsure whether this is the correct way:
Code:
// stocks beta by PERMNO date
use "Beta Monthly new"
format ym %tm
// merge the beta PVS
merge 1:1 PERMNO ym using "bPVS FINAL"
drop if _merge == 1| _merge == 2
// clean dataset
collapse beta betaPVS, by(dateff PERMNO)
drop if beta == . | betaPVS == .
// panel dataset
egen firm = group(PERMNO)
egen time = group(date)
// Sort the data by date
sort date
// Create a variable to store the correlation results
gen ρβ_βΔPVS = .
// Generate a unique identifier for each month
egen month = group(date)
// Get the list of unique month identifiers
levelsof month, local(months)
// Loop over each month to calculate the correlation
foreach m of local months {
* Restrict the data to the current month
preserve
keep if month == `m'
* Check if there are enough observations to calculate correlation
quietly count
if r(N) > 1 {
* Debug: Print the current month and number of observations
di "Processing month: `m' with observations: " r(N)
* Calculate the correlation
corr beta betaPVS, means
* Debug: Print the correlation matrix
matrix list r(C)
* Extract the correlation coefficient
matrix C = r(C)
local corr = C[1,2]
* Restore the full dataset
restore
* Store the result in the dataset
replace ρβ_βΔPVS = `corr' if month == `m'
}
else {
* Debug: Print message if not enough observations
di "Not enough observations for month: `m'"
* Restore the full dataset if not enough observations
restore
}
}
// Check the results
list date ρβ_βΔPVS if !missing(ρβ_βΔPVS)
// Summarize the correlations
summarize ρβ_βΔPVS
// Calculate the median of the corr_beta_ivol
sum ρβ_βΔPVS, detail
// Split the sample based on median correlation
egen median_ρβ_βΔPVS = median(ρβ_βΔPVS)
gen high_low_corr = ρβ_βΔPVS >= median_ρβ_βΔPVS
// clean dataset
collapse high_low_corr, by (dateff PERMNO)
In case it is needed, this is a small portion of the dataset I have
PERMNO date beta betaPVS
10006 31aug1971 .79016378 -.00768161
10014 31aug1971 1.9341393 -.02881953
10057 31aug1971 .97684805 .00235655
10102 31aug1971 1.4654419 .00578056
10137 31aug1971 .80199136 -.00021375
10145 31aug1971 1.2391442 .0222873
10153 31aug1971 1.7417629 -.00638195
10161 31aug1971 .8160666 .01480798
10188 31aug1971 2.2849678 .0101683
10225 31aug1971 .79444095 -.00114427
10233 31aug1971 1.4605849 -.00501463
10241 31aug1971 .74535671 -.00248796
10276 31aug1971 .26219438 -.00930459
10321 31aug1971 1.3446186 -.01120439
10364 31aug1971 1.0343247 -.0062352
10372 31aug1971 .98055205 .01119765
10401 31aug1971 .7315204 .00118681
10460 31aug1971 1.7546463 .01137953
10479 31aug1971 1.1086954 -.01760248
10487 31aug1971 .50537227 -.01441397
10495 31aug1971 1.1820425 -.00525025
10516 31aug1971 1.1805342 -.00638053
10559 31aug1971 .76682362 .00182585
10604 31aug1971 1.3611181 -.00065288
10698 31aug1971 1.0273563 -.00911494
10727 31aug1971 .32257196 -.00793185
10743 31aug1971 1.0391832 .01016556
10751 31aug1971 1.6242846 -.01652368
Thank to everyone in advance for you answer,
Please let me know if something else is needed from my side
Best regards,
Giovanni Dell'Erba
Comment