Hello Statalist,
I have a problem that I have been struggling both to solve and to put into words, so I will do my best. My dataset contains air travel ticket information, including route, air carrier, fare, etc. I am trying to run a difference-in-differences analysis in which the treatment group is routes served by either 'AA' (American Airlines) or 'US' (US Airways). How can I generate a dummy (indicator) variable for each observation (ticket) that flags whether the ticket was for a route that is (or was) served by either 'AA' or 'US'? This is simple when 'AA' or 'US' is the carrier on the ticket itself, but harder when the ticket was sold by another carrier on a route that 'AA' or 'US' also serves.
For some dataset context: yq is my date variable (quarterly), tkcarrier is the airline selling the ticket, and overallroute describes the route as IATA airport codes joined with ':'.
I hope I have adequately explained the problem I am trying to solve. I have included a sample of my dataset below. I apologize if I used -dataex- incorrectly, as this is my first time.
Thanks!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float yq str2 tkcarrier str7 overallroute
211 "UA" "LAX:DEN"
217 "AA" "SAT:LGA"
233 "AA" "ILM:ORD"
204 "UA" "DEN:BUR"
222 "DL" "MDW:BUF"
204 "DL" "GRB:DSM"
208 "WN" "BHM:HOU"
238 "UA" "DFW:PWM"
217 "UA" "AUS:ROC"
232 "DL" "CLE:JAX"
239 "WN" "LAX:IND"
205 "US" "SFO:IAH"
236 "DL" "PDX:DAY"
233 "WN" "SEA:GRR"
211 "DL" "LGB:SLC"
222 "DL" "BNA:MSY"
235 "AA" "BDL:MKE"
200 "CO" "EWR:RDU"
224 "AA" "EYW:ORF"
237 "AA" "LAX:OMA"
end
format %tq yq
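For illustration, here is a minimal sketch of one way such an indicator might be built. It is not a definitive answer. It assumes that a route counts as "served" by AA or US if at least one ticket in the data on that exact overallroute string lists either carrier in tkcarrier, in any quarter. The variable names aa_us_ticket and aa_us_route are hypothetical.

```stata
* Assumption: a route is "served" by AA/US if any ticket in the data
* on that route was sold by AA or US, in any quarter.

* 1 if this ticket itself was sold by AA or US, 0 otherwise
generate byte aa_us_ticket = inlist(tkcarrier, "AA", "US")

* Spread the flag to every ticket on the same route: the within-route
* maximum is 1 as soon as one AA/US ticket exists on that route
bysort overallroute: egen byte aa_us_route = max(aa_us_ticket)

* Quick check: how many non-AA/US tickets fall on flagged routes
tabulate tkcarrier aa_us_route
```

Under this sketch "LAX:DEN" and "DEN:LAX" are treated as different routes, and a route served by AA or US but with no AA/US tickets in the data would not be flagged. In the 20-row sample above no route repeats, so aa_us_route simply equals aa_us_ticket there; the two would differ only in the full dataset.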