Dummy variable for dates

Anouk Hoek

Join Date: Jan 2023

Posts: 6
#1

14 Nov 2024, 07:11

Hi everyone,
I am using the first sheet of this Excel file in Stata. What I have done so far is the following:
//for the dates
gen adjusted_hour = HH - 1
gen year = floor(YYYYMMDD / 10000)
gen month = floor((YYYYMMDD - year * 10000) / 100)
gen day = mod(YYYYMMDD, 100)
gen double datetime = dhms(mdy(month, day, year), adjusted_hour, 0, 0)
format datetime %tc
drop year month day

//to reshape data from wide to long format:
rename AmsterdamStadhouderskadeNOx pollutant1
rename AmsterdamJanvanGalenstraatNO pollutant2
rename AmsterdamVanDiemenstraatNOx pollutant3
rename AmsterdamHaarlemmerwegNOx pollutant4
reshape long pollutant, i(datetime) j(location_id)

gen location_name = ""
replace location_name = "AmsterdamStadhouderskadeNOx" if location_id == 1
replace location_name = "Amsterdam Janvan GalenstraatNO" if location_id == 2
replace location_name = "AmsterdamVanDiemenstraatNOx" if location_id == 3
replace location_name = "AmsterdamHaarlemmerwegNOx" if location_id == 4

I now want to create a dummy variable that equals 1 from December 8, 2022 onwards and 0 otherwise, but I cannot get it to work. I have tried several things with ChatGPT, but I get either only 0's or only 1's.

Things I have tried:
(1)
* Define the cutoff date (December 8, 2022, at 00:00:00)
gen double reduction_date = mdy(12, 8, 2022) + 0 // Using mdy() for December 8, 2022
* Create the post_reduction variable: 0 for before Dec 8, 2022, and 1 for Dec 8, 2022 or later
gen post_reduction = (datetime >= reduction_date)
* Check the result
list datetime post_reduction in 1/10

(2)
* Ensure the datetime variable is in Stata date format (remove the time part)
gen date = dofc(datetime)
* Format the date variable to display as dd/mm/yyyy
format date %td
* Check the numeric value of the date on December 8, 2022, and ensure the date comparison works
display mdy(12, 8, 2022)
* Create the post-reduction variable by comparing date correctly
gen post_reduction = (date == mdy(12, 8, 2022))

(3)
// Create the post_reduction variable (1 for December 8, 2022 and onwards)
gen date_only = dofd(datetime) // Extract date from datetime
format date_only %td // Apply date format to the new date-only variable
gen post_reduction = (date_only >= td(08dec2022)) // Create post_reduction variable

(4)
gen post_reduction = (datetime >= clock("08dec2022 00:00", "DMY hms"))

However, none of this worked. I am a bit lost now, so I was hoping someone could help me. I am not really familiar with Stata, so I am using ChatGPT a lot.
Attached Files

definitief 2022-2023 AMS.xlsx (1.17 MB, 1 view)
Andrew Musau

Join Date: Oct 2014

Posts: 9773
#2

14 Nov 2024, 07:41

You already have five posts on Statalist, so you should not be attaching spreadsheets. Instead, use the dataex command as described in FAQ Advice #12. Assuming that you have a proper datetime variable, see the convenience function -td()-.

Code:

help td()

Code:

gen wanted = dofc(datetime) >= td(8dec2022)
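
A minimal sketch of the same idea on toy data, in case it helps to see it run end to end. The three timestamps and the names stamp, post_a and post_b are made up for illustration; only datetime follows the original post. It assumes datetime is a %tc variable stored as double. The `if !missing(datetime)` qualifier is an addition: without it, a missing datetime would compare as larger than any date and be coded 1.

```stata
* Toy data: three timestamps around the cutoff (values are made up)
clear
input str18 stamp
"07dec2022 23:00:00"
"08dec2022 00:00:00"
"09dec2022 12:00:00"
end

* %tc values are milliseconds since 01jan1960 and must be stored as double
gen double datetime = clock(stamp, "DMY hms")
format datetime %tc

* Option A: convert the datetime to a daily date, compare with a daily literal
gen byte post_a = dofc(datetime) >= td(08dec2022) if !missing(datetime)

* Option B: keep the datetime and compare with a datetime literal
gen byte post_b = datetime >= tc(08dec2022 00:00:00) if !missing(datetime)

list stamp datetime post_a post_b, noobs
```

Both options should give the same dummy; the point is only that the two sides of the comparison must be in the same units (days with days, or milliseconds with milliseconds).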

Last edited by Andrew Musau; 14 Nov 2024, 07:48.
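
For readers wondering why attempts (1) to (4) in #1 gave only 0's or only 1's, here is a hedged reading rather than a confirmed diagnosis, since the data were not posted with dataex. Attempt (1) compares a datetime in milliseconds with a daily date in days, so every observation passes the test. Attempt (2) uses == and therefore flags only December 8 itself; in addition, if post_reduction already exists from an earlier attempt, a second gen with the same name stops with an error. Attempt (3) calls dofd(), which as far as I know is not a Stata function; dofc() is the one for %tc values. Attempt (4) uses the mask "DMY hms" on a string without seconds, which may not parse, and a comparison against missing is 0 for every nonmissing datetime. The sketch below only displays the magnitudes involved.

```stata
* Daily dates count days since 01jan1960; %tc datetimes count milliseconds
display mdy(12, 8, 2022)                  // daily date, about 2.3e4
display %15.0f tc(08dec2022 00:00:00)     // datetime, about 1.99e12
display %15.0f cofd(mdy(12, 8, 2022))     // the same day converted to %tc units

* Check whether a clock() mask parses before using it in a comparison:
* a missing result (.) means the mask and the string do not match
display clock("08dec2022 00:00", "DMY hms")
display %15.0f clock("08dec2022 00:00", "DMY hm")
```

If a dummy from an earlier attempt is still in memory, `capture drop post_reduction` before the next gen avoids the "already defined" error.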
Anouk Hoek

Join Date: Jan 2023

Posts: 6
#3

14 Nov 2024, 07:58

Thank you, Andrew Musau.