Dear Statalist users,
I am currently conducting a research project which analyzes the reaction of stock returns on earnings-announcements.
My first dataset contains daily data for stock returns, where each security is identified by an ID (ISIN). The dates in my daily stock return dataset only contain days from Monday to Friday. Some returns are missing (due to European holidays).
My second dataset contains quarterly accounting data, the publication date of the data and the security ID (ISIN).
Both datasets are longitudinal.
The problem is the following:
1) I want to merge the second dataset (accounting values) to the daily stock price data (first dataset) via ISIN (security ID) and Date.
However some dates of the accounting publications are not contained in the daily stock return data (e.g. days on the weekend and some business days, that are missing in the return dataset). Therefore using a 1:1 merge results in accounting datapoints that are not properly merged (_merge==2) with the price data.
My goal is to assign the accounting values where the date of the publication is missing in the return data, to the next day in my return dataset, which has non-missing return data (in order to evaluate the effect of the publication).
2) A further complication is, that the publication dates of the accounting values contain the time (the format for the publication dates is %tc, e.g. 27feb2018 07:01:00). The date format for the daily returns is given in %tdnn/dd/CCYY.
As European markets close at 5:30 pm (17:30:00), I would like to assign publication dates from 17:30:00 onwards to the next trading day (in my return dataset) with non-missing return data.
Unfortunately, I could not come up with a solution for part 1). I would really appreciate any help.
With regards to Problem 2) I have tried to solve the issue in the following way:
I would also appreciate any comments on my procedure regarding problem 2)
A few datapoints regarding problem 2)
I am currently conducting a research project which analyzes the reaction of stock returns on earnings-announcements.
My first dataset contains daily data for stock returns, where each security is identified by an ID (ISIN). The dates in my daily stock return dataset only contain days from Monday to Friday. Some returns are missing (due to European holidays).
My second dataset contains quarterly accounting data, the publication date of the data and the security ID (ISIN).
Both datasets are longitudinal.
The problem is the following:
1) I want to merge the second dataset (accounting values) to the daily stock price data (first dataset) via ISIN (security ID) and Date.
However some dates of the accounting publications are not contained in the daily stock return data (e.g. days on the weekend and some business days, that are missing in the return dataset). Therefore using a 1:1 merge results in accounting datapoints that are not properly merged (_merge==2) with the price data.
My goal is to assign the accounting values where the date of the publication is missing in the return data, to the next day in my return dataset, which has non-missing return data (in order to evaluate the effect of the publication).
2) A further complication is, that the publication dates of the accounting values contain the time (the format for the publication dates is %tc, e.g. 27feb2018 07:01:00). The date format for the daily returns is given in %tdnn/dd/CCYY.
As European markets close at 5:30 pm (17:30:00), I would like to assign publication dates from 17:30:00 onwards to the next trading day (in my return dataset) with non-missing return data.
Unfortunately, I could not come up with a solution for part 1). I would really appreciate any help.
With regards to Problem 2) I have tried to solve the issue in the following way:
Code:
gen Publication_Date = dofc(DateTime) format Publication_Date %td // return the date of the publication without the time gen double step1=cofd(Publication_Date) // add a time factor to the publication date format step1 %tc gen double close = step1 + msofhours(17) + msofminutes(30) +msofseconds(0) // generate a variable for the date and time of the market close format %tc close drop Publication_Date gen double assignment = close + msofhours(24) if DateTime>=close // add one day to the day of the initial publication, if the data is published after the close format %tc assignment replace assignment=DateTime if assignment==. gen assignment_date = dofc(assignment) format assignment_date %td
A few datapoints regarding problem 2)
Code:
clear input float ID double DateTime 1 1837806780000.0002 2 1810530059999.9998 2 1.825692e+12 2 1856554020000.0002 2 1842196860000 3 1.83654e+12 3 1850713199999.9998 3 1819270500000.0002 end format %tc DateTime
Comment