Hi everyone,
I am using the first sheet of this Excel file in Stata. What I have done so far is the following:
// Build a Stata datetime (%tc) variable from YYYYMMDD and HH
gen adjusted_hour = HH - 1
gen year = floor(YYYYMMDD / 10000)
gen month = floor((YYYYMMDD - year * 10000) / 100)
gen day = mod(YYYYMMDD, 100)
gen double datetime = dhms(mdy(month, day, year), adjusted_hour, 0, 0)
format datetime %tc
drop year month day
//to reshape data from wide to long format:
rename AmsterdamStadhouderskadeNOx pollutant1
rename AmsterdamJanvanGalenstraatNO pollutant2
rename AmsterdamVanDiemenstraatNOx pollutant3
rename AmsterdamHaarlemmerwegNOx pollutant4
reshape long pollutant, i(datetime) j(location_id)
gen location_name = ""
replace location_name = "AmsterdamStadhouderskadeNOx" if location_id == 1
replace location_name = "AmsterdamJanvanGalenstraatNO" if location_id == 2
replace location_name = "AmsterdamVanDiemenstraatNOx" if location_id == 3
replace location_name = "AmsterdamHaarlemmerwegNOx" if location_id == 4
I now want to create a dummy variable equal to 1 from December 8, 2022 onwards and 0 otherwise, but I cannot get it to work. I have tried several things suggested by ChatGPT, but I get either all 0s or all 1s.
Things I have tried:
(1)
* Define the cutoff date (December 8, 2022, at 00:00:00)
gen double reduction_date = mdy(12, 8, 2022) + 0 // Using mdy() for December 8, 2022
* Create the post_reduction variable: 0 for before Dec 8, 2022, and 1 for Dec 8, 2022 or later
gen post_reduction = (datetime >= reduction_date)
* Check the result
list datetime post_reduction in 1/10
(2)
* Ensure the datetime variable is in Stata date format (remove the time part)
gen date = dofc(datetime)
* Format the date variable to display as dd/mm/yyyy
format date %td
* Check the numeric value of the date on December 8, 2022, and ensure the date comparison works
display mdy(12, 8, 2022)
* Create the post-reduction variable by comparing date correctly
gen post_reduction = (date == mdy(12, 8, 2022))
(3)
// Create the post_reduction variable (1 for December 8, 2022 and onwards)
gen date_only = dofd(datetime) // Extract date from datetime
format date_only %td // Apply date format to the new date-only variable
gen post_reduction = (date_only >= td(08dec2022)) // Create post_reduction variable
(4)
gen post_reduction = (datetime >= clock("08dec2022 00:00", "DMY hms"))
None of these worked, and I am a bit lost now, so I was hoping someone could help. I am not very familiar with Stata, which is why I have been relying on ChatGPT a lot.
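For reference, a minimal sketch of the comparison being aimed for. It assumes datetime is the %tc double created at the top of the post (milliseconds since 01jan1960), so the cutoff has to be a %tc value as well rather than a daily date from mdy() or td(). The capture drop line is only there because the earlier attempts may already have created post_reduction.

```stata
* Sketch only: assumes datetime is the %tc double built above
* (milliseconds since 01jan1960 00:00:00).

* Show that the two kinds of cutoff are on very different scales
display %15.0f mdy(12, 8, 2022)          // daily date: days since 01jan1960
display %15.0f tc(08dec2022 00:00:00)    // datetime: milliseconds since 01jan1960

* Remove any leftover variable from the earlier attempts
capture drop post_reduction

* Compare like with like: a %tc variable against a %tc cutoff
gen byte post_reduction = (datetime >= tc(08dec2022 00:00:00)) if !missing(datetime)

* Both 0 and 1 should appear if the data span the cutoff
tabulate post_reduction, missing
```

An equivalent route, under the same assumption, is to convert the datetime to a daily date first and compare dofc(datetime) with td(08dec2022).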