Dear all,
I am working with data on traffic violations recorded over one year. As shown in the table below, every row in the database corresponds to a unique traffic violation. There are 267,000 traffic violations recorded in this database, so the id runs from 1 to 267,000. The violation type column indicates the type of traffic violation that occurred (3 types of traffic violations are recorded in this database). The other three characteristics recorded for each violation relate to the driver who committed it. For example, the first row indicates that a seat belt violation occurred and that it was committed by a 37-year-old man whose education level is 4. I fitted a multinomial logit model to identify the effect of the driver's age, sex and education level on the type of violation. For example, I want to report that as the driver's age increases, the probability of committing a speeding violation decreases. I have 3 questions:
1: Did I follow an appropriate process, and is the multinomial logit model suitable for this purpose?
2: I want to account for unobserved variables that may affect the response variable (the choice among the 3 violation types). For this purpose, should I fit a mixed multinomial logit model, or some other model? What would be the code for that model in Stata?
3: Is it possible to fit a random-effects multinomial logit model to data that are not panel data (for example, my data cover just one year of traffic violations)? If the answer is yes, what is the code in Stata?
id | violation type | driver's age | driver's sex (1 = male) | driver's education level (1 to 8)
1 | seat belt | 37 | 1 | 4
2 | speeding | 19 | 1 | 7
3 | seat belt | 24 | 1 | 2
4 | using mobile phone | 30 | 0 | 5
5 | speeding | 28 | 1 | 5
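
For concreteness, here is a minimal sketch of the kind of multinomial logit specification described above. It is only an illustration: the variable names (violation, age, male, educ) and the numeric coding of violation are assumptions, not the names in the actual dataset, and treating education as categorical is one possible choice among several.

```stata
* Hypothetical variable names; the coding of violation is assumed to be
*   1 = seat belt, 2 = speeding, 3 = using mobile phone
* age enters as continuous; sex and education level as categorical
mlogit violation c.age i.male i.educ, baseoutcome(1)

* Average marginal effect of age on the probability that a recorded
* violation is a speeding violation (outcome 2 under the assumed coding)
margins, dydx(age) predict(outcome(2))
```

Because every row is a recorded violation, the predicted probabilities from such a model are probabilities of each violation type given that a violation occurred, so a negative marginal effect of age for outcome 2 would be read in that conditional sense.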