  • Pre-Post analysis: How to compare frequencies for each category across rounds?

    Hi all,

    I have panel data of the following form:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str11 ID float period strL INDUSTRY
    "10000111_1"  1 "Not Applicable"                                                         
    "10000111_10" 1 "Not Applicable"                                                         
    "10000111_11" 1 "Not Applicable"                                                         
    "10000111_12" 1 "Not Applicable"                                                         
    "10000111_13" 1 "Not Applicable"                                                         
    "10000111_14" 1 "Not Applicable"                                                         
    "10000111_15" 1 "Not Applicable"                                                         
    "10000111_16" 1 "Not Applicable"                                                         
    "10000111_2"  1 "Retail Trade"                                                           
    "10000111_3"  1 "Not Applicable"                                                         
    "10000111_4"  1 "Not Applicable"                                                         
    "10000111_5"  1 "Not Applicable"                                                         
    "10000111_6"  1 "Not Applicable"                                                         
    "10000111_7"  1 "Not Applicable"                                                         
    "10000111_8"  1 "Not Applicable"                                                         
    "10000111_9"  1 "Not Applicable"                                                         
    "10001060_1"  0 "Crop Cultivation"                                                       
    "10001060_1"  1 "Crop Cultivation"                                                       
    "10001060_1"  1 "Crop Cultivation"                                                       
    "10001060_2"  0 "Crop Cultivation"                                                       
    "10001060_2"  1 "Crop Cultivation"                                                       
    "10001060_2"  1 "Crop Cultivation"                                                       
    "10001060_3"  0 "Not Applicable"                                                         
    "10001060_3"  1 "Not Applicable"                                                         
    "10001060_3"  1 "Not Applicable"                                                         
    "10001060_4"  0 "Not Applicable"                                                         
    "10001060_4"  1 "Not Applicable"                                                         
    "10001060_4"  1 "Not Applicable"                                                         
    "10001079_1"  0 "Cement, Tiles, Bricks, Ceramics, Glass and other construction materials"
    "10001079_1"  0 "Crop Cultivation"                                                       
    "10001079_1"  1 "Crop Cultivation"                                                       
    "10001079_1"  1 "Crop Cultivation"                                                       
    "10001079_2"  0 "Not Applicable"                                                         
    "10001079_2"  0 "Not Applicable"                                                         
    "10001079_2"  1 "Not Applicable"                                                         
    "10001079_2"  1 "Not Applicable"                                                         
    "10001079_3"  0 "Not Applicable"                                                         
    "10001079_3"  0 "Not Applicable"                                                         
    "10001079_3"  1 "Not Applicable"                                                         
    "10001079_3"  1 "Not Applicable"                                                         
    "10001079_4"  0 "Not Applicable"                                                         
    "10001079_4"  0 "Not Applicable"                                                         
    "10001079_4"  1 "Not Applicable"                                                         
    "10001079_4"  1 "Not Applicable"                                                         
    "10001079_5"  0 "Not Applicable"                                                         
    "10001079_5"  0 "Not Applicable"                                                         
    "10001079_5"  1 "Not Applicable"                                                         
    "10001079_5"  1 "Not Applicable"                                                         
    "10002083_1"  0 "Not Applicable"                                                         
    "10002083_1"  1 "Not Applicable"                                                         
    "10002083_1"  1 "Retail Trade"                                                           
    "10002083_2"  0 "Not Applicable"                                                         
    "10002083_2"  1 "Not Applicable"                                                         
    "10002083_2"  1 "Not Applicable"                                                         
    "10002083_3"  0 "Not Applicable"                                                         
    "10002083_3"  1 "Not Applicable"                                                         
    "10002083_3"  1 "Not Applicable"                                                         
    "10002083_4"  0 "Not Applicable"                                                         
    "10002083_4"  1 "Not Applicable"                                                         
    "10002083_4"  1 "Not Applicable"                                                         
    "10002083_5"  0 "Not Applicable"                                                         
    "10002083_5"  1 "Not Applicable"                                                         
    "10002083_5"  1 "Not Applicable"                                                         
    "10002181_1"  0 "Not Applicable"                                                         
    "10002181_1"  0 "Not Applicable"                                                         
    "10002181_1"  1 "Not Applicable"                                                         
    "10002181_1"  1 "Not Applicable"                                                         
    "10002181_2"  0 "Repair, Maintenance and Operations (RMO)"                               
    "10002181_2"  0 "Personal Non-Professional Services"                                     
    "10002181_2"  1 "Travel and Tourism"                                                     
    "10002181_2"  1 "Travel and Tourism"                                                     
    "10002181_3"  0 "Not Applicable"                                                         
    "10002181_3"  0 "Not Applicable"                                                         
    "10002181_3"  1 "Not Applicable"                                                         
    "10002181_3"  1 "Personal Non-Professional Services"                                     
    "10002477_1"  0 "Not Applicable"                                                         
    "10002477_1"  0 "Not Applicable"                                                         
    "10002477_1"  1 "Not Applicable"                                                         
    "10002477_1"  1 "Not Applicable"                                                         
    "10002477_10" 1 "Not Applicable"                                                         
    "10002477_10" 1 "Not Applicable"                                                         
    "10002477_2"  0 "Not Applicable"                                                         
    "10002477_2"  0 "Not Applicable"                                                         
    "10002477_2"  1 "Food Industries"                                                        
    "10002477_2"  1 "Not Applicable"                                                         
    "10002477_3"  0 "Travel and Tourism"                                                     
    "10002477_3"  0 "Cement, Tiles, Bricks, Ceramics, Glass and other construction materials"
    "10002477_3"  1 "Travel and Tourism"                                                     
    "10002477_3"  1 "Automobiles and Other Transport Equipment Manufacturers"                
    "10002477_4"  0 "Not Applicable"                                                         
    "10002477_4"  0 "Not Applicable"                                                         
    "10002477_4"  1 "Fruits and Vegetable Farming"                                           
    "10002477_4"  1 "Food Industries"                                                        
    "10002477_5"  0 "Not Applicable"                                                         
    "10002477_5"  0 "Not Applicable"                                                         
    "10002477_5"  1 "Not Applicable"                                                         
    "10002477_5"  1 "Not Applicable"                                                         
    "10002477_6"  0 "Not Applicable"                                                         
    "10002477_6"  0 "Not Applicable"                                                         
    "10002477_6"  1 "Not Applicable"                                                         
    end
    Period 0 is the period before the policy change was implemented; period 1 is the period after implementation.
    To get a summary of the variable INDUSTRY, I ran
    Code:
    tab INDUSTRY if period==0
    and
    Code:
    tab INDUSTRY  if period==1
    Comparing the two tables, the percentages for the industry categories differ between periods: some have gone up and some have gone down. The policy intervention is expected to have an adverse impact on employment, so falls in the percentages are the focus of our study. I now want to test whether the percentages are significantly different. I am thinking of something like a chi-squared test for comparing frequencies, so that I can see whether "freq_category`i'_period 0" is significantly different from "freq_category`i'_period 1". My understanding, however, is that the chi-squared test applies only when the pre and post samples are independent, which led me to think that I need a different test for panel data.
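
    The two one-way tables can be viewed side by side with a single two-way tabulation. This is only a descriptive sketch using the variable names from the example above. It does not address the dependence between pre and post observations raised in the question, so no test statistic is requested.

    ```stata
    * Column percentages: share of each INDUSTRY category within each period
    tabulate INDUSTRY period, column
    ```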

    I would be extremely grateful if someone could guide me towards an appropriate test in this case.

    Regards,


  • #2
    Can you explain in greater detail the structure in your data? You have a variable named ID that appears to identify nothing. The same value of ID can be associated with multiple industries, even during a single period. So it isn't clear what unit of analysis each observation represents. What is the sampling scheme that generated this data? What is this variable ID? To what extent are the observations in your data set repeated observations of the same entities, and to what extent are they independent? How can you tell when two observations in your data set are observations of the same entity and when they are different? Are there actually matched pre-post pairs in your data? If so, how do you find them in the data?
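
    One way to check whether ID and period jointly identify observations is sketched below, using the variable names from the example. The name n_extra is made up for illustration.

    ```stata
    * Summarize how many observations share the same ID-period combination
    duplicates report ID period

    * Count surplus copies within each ID-period group (0 means unique)
    duplicates tag ID period, generate(n_extra)
    tabulate n_extra

    * Stops with an error if ID and period do not uniquely identify observations
    isid ID period
    ```

    If isid exits with an error, the observations with n_extra > 0 are the ones to inspect.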



    • #3
      Hello Clyde. Apologies for leaving out these important details; I explain them below:

      This is data on each member of different households. So the ID variable identifies individuals.

      For each individual, data on their primary occupation has been recorded. The variable INDUSTRY refers to industry of that primary occupation. So, during a single period, each ID would be associated with only one INDUSTRY as only the primary occupation for each individual has been recorded.

      This is a longitudinal survey. The same households, and the same members from each household have been tracked over various survey rounds and each individual has been uniquely identified (through the ID variable) across rounds. There is little variation in the response rates across rounds, and we can take the observations as repeated observations of the same entity.

      I can tell whether two observations relate to the same entity by looking at the ID, which uniquely identifies each individual. An ID appearing in both period 0 and period 1 tells me that the first observation is about a particular individual in period 0 and the second is about the same individual in period 1.

      Please let me know if this provides any clarity. Thanks



      • #4
        Well, it's much clearer now, but your data do not actually match your description:

        So, during a single period, each ID would be associated with only one INDUSTRY as only the primary occupation for each individual has been recorded.
        Code:
          +-----------------------------------------------------------------------------------------------+
          |         ID   period                                                                  INDUSTRY |
          |-----------------------------------------------------------------------------------------------|
          | 10001079_1        0   Cement, Tiles, Bricks, Ceramics, Glass and other construction materials |
          | 10001079_1        0                                                          Crop Cultivation |
          |-----------------------------------------------------------------------------------------------|
          | 10002083_1        1                                                            Not Applicable |
          | 10002083_1        1                                                              Retail Trade |
          |-----------------------------------------------------------------------------------------------|
          | 10002181_2        0                                        Personal Non-Professional Services |
          | 10002181_2        0                                  Repair, Maintenance and Operations (RMO) |
          |-----------------------------------------------------------------------------------------------|
          | 10002181_3        1                                                            Not Applicable |
          | 10002181_3        1                                        Personal Non-Professional Services |
          |-----------------------------------------------------------------------------------------------|
          | 10002477_2        1                                                           Food Industries |
          | 10002477_2        1                                                            Not Applicable |
          |-----------------------------------------------------------------------------------------------|
          | 10002477_3        0   Cement, Tiles, Bricks, Ceramics, Glass and other construction materials |
          | 10002477_3        0                                                        Travel and Tourism |
          |-----------------------------------------------------------------------------------------------|
          | 10002477_3        1                   Automobiles and Other Transport Equipment Manufacturers |
          | 10002477_3        1                                                        Travel and Tourism |
          |-----------------------------------------------------------------------------------------------|
          | 10002477_4        1                                                           Food Industries |
          | 10002477_4        1                                              Fruits and Vegetable Farming |
          +-----------------------------------------------------------------------------------------------+
        So before we worry about the details of analysis, you need to fix your data set.
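
        A listing like the one above can be produced along the following lines. This is a sketch, not necessarily the code that generated the listing, and the variable name conflict is made up for illustration.

        ```stata
        * Within each ID-period group, sort by INDUSTRY and flag groups
        * whose first and last values differ (i.e., more than one industry)
        bysort ID period (INDUSTRY): generate byte conflict = INDUSTRY[1] != INDUSTRY[_N]

        * Show only the conflicting groups, separated by ID-period
        list ID period INDUSTRY if conflict, sepby(ID period) noobs
        ```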



        • #5
          The conflicting entries were due to the way I had coded the pre-post time periods, which I now realise was incorrect. I have made corrections. In the corrected data, periods 7 and 8 are pre-policy, and periods 9 and 10 are post-policy. Each ID should now be associated with four time periods (7, 8, 9 and 10), each with a corresponding entry for INDUSTRY. Please let me know if this looks okay. Also, thank you for taking the time to go through my question and pointing out my mistakes. The corrected data set is shown below:

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str11 ID float period strL INDUSTRY
          "10000111_1"   7 "Data Not Available"                                                     
          "10000111_1"   8 "Data Not Available"                                                     
          "10000111_1"   9 "Data Not Available"                                                     
          "10000111_1"  10 "Not Applicable"                                                         
          "10000111_10"  7 "Data Not Available"                                                     
          "10000111_10"  8 "Data Not Available"                                                     
          "10000111_10"  9 "Data Not Available"                                                     
          "10000111_10" 10 "Not Applicable"                                                         
          "10000111_11"  7 "Data Not Available"                                                     
          "10000111_11"  8 "Data Not Available"                                                     
          "10000111_11"  9 "Data Not Available"                                                     
          "10000111_11" 10 "Not Applicable"                                                         
          "10000111_12"  7 "Data Not Available"                                                     
          "10000111_12"  8 "Data Not Available"                                                     
          "10000111_12"  9 "Data Not Available"                                                     
          "10000111_12" 10 "Not Applicable"                                                         
          "10000111_13"  7 "Data Not Available"                                                     
          "10000111_13"  8 "Data Not Available"                                                     
          "10000111_13"  9 "Data Not Available"                                                     
          "10000111_13" 10 "Not Applicable"                                                         
          "10000111_14"  7 "Data Not Available"                                                     
          "10000111_14"  8 "Data Not Available"                                                     
          "10000111_14"  9 "Data Not Available"                                                     
          "10000111_14" 10 "Not Applicable"                                                         
          "10000111_15"  7 "Data Not Available"                                                     
          "10000111_15"  8 "Data Not Available"                                                     
          "10000111_15"  9 "Data Not Available"                                                     
          "10000111_15" 10 "Not Applicable"                                                         
          "10000111_16"  7 "Data Not Available"                                                     
          "10000111_16"  8 "Data Not Available"                                                     
          "10000111_16"  9 "Data Not Available"                                                     
          "10000111_16" 10 "Not Applicable"                                                         
          "10000111_2"   7 "Data Not Available"                                                     
          "10000111_2"   8 "Data Not Available"                                                     
          "10000111_2"   9 "Data Not Available"                                                     
          "10000111_2"  10 "Retail Trade"                                                           
          "10000111_3"   7 "Data Not Available"                                                     
          "10000111_3"   8 "Data Not Available"                                                     
          "10000111_3"   9 "Data Not Available"                                                     
          "10000111_3"  10 "Not Applicable"                                                         
          "10000111_4"   7 "Data Not Available"                                                     
          "10000111_4"   8 "Data Not Available"                                                     
          "10000111_4"   9 "Data Not Available"                                                     
          "10000111_4"  10 "Not Applicable"                                                         
          "10000111_5"   7 "Data Not Available"                                                     
          "10000111_5"   8 "Data Not Available"                                                     
          "10000111_5"   9 "Data Not Available"                                                     
          "10000111_5"  10 "Not Applicable"                                                         
          "10000111_6"   7 "Data Not Available"                                                     
          "10000111_6"   8 "Data Not Available"                                                     
          "10000111_6"   9 "Data Not Available"                                                     
          "10000111_6"  10 "Not Applicable"                                                         
          "10000111_7"   7 "Data Not Available"                                                     
          "10000111_7"   8 "Data Not Available"                                                     
          "10000111_7"   9 "Data Not Available"                                                     
          "10000111_7"  10 "Not Applicable"                                                         
          "10000111_8"   7 "Data Not Available"                                                     
          "10000111_8"   8 "Data Not Available"                                                     
          "10000111_8"   9 "Data Not Available"                                                     
          "10000111_8"  10 "Not Applicable"                                                         
          "10000111_9"   7 "Data Not Available"                                                     
          "10000111_9"   8 "Data Not Available"                                                     
          "10000111_9"   9 "Data Not Available"                                                     
          "10000111_9"  10 "Not Applicable"                                                         
          "10001060_1"   7 "Data Not Available"                                                     
          "10001060_1"   8 "Crop Cultivation"                                                       
          "10001060_1"   9 "Crop Cultivation"                                                       
          "10001060_1"  10 "Crop Cultivation"                                                       
          "10001060_2"   7 "Data Not Available"                                                     
          "10001060_2"   8 "Crop Cultivation"                                                       
          "10001060_2"   9 "Crop Cultivation"                                                       
          "10001060_2"  10 "Crop Cultivation"                                                       
          "10001060_3"   7 "Data Not Available"                                                     
          "10001060_3"   8 "Not Applicable"                                                         
          "10001060_3"   9 "Not Applicable"                                                         
          "10001060_3"  10 "Not Applicable"                                                         
          "10001060_4"   7 "Data Not Available"                                                     
          "10001060_4"   8 "Not Applicable"                                                         
          "10001060_4"   9 "Not Applicable"                                                         
          "10001060_4"  10 "Not Applicable"                                                         
          "10001079_1"   7 "Cement, Tiles, Bricks, Ceramics, Glass and other construction materials"
          "10001079_1"   8 "Crop Cultivation"                                                       
          "10001079_1"   9 "Crop Cultivation"                                                       
          "10001079_1"  10 "Crop Cultivation"                                                       
          "10001079_2"   7 "Not Applicable"                                                         
          "10001079_2"   8 "Not Applicable"                                                         
          "10001079_2"   9 "Not Applicable"                                                         
          "10001079_2"  10 "Not Applicable"                                                         
          "10001079_3"   7 "Not Applicable"                                                         
          "10001079_3"   8 "Not Applicable"                                                         
          "10001079_3"   9 "Not Applicable"                                                         
          "10001079_3"  10 "Not Applicable"                                                         
          "10001079_4"   7 "Not Applicable"                                                         
          "10001079_4"   8 "Not Applicable"                                                         
          "10001079_4"   9 "Not Applicable"                                                         
          "10001079_4"  10 "Not Applicable"                                                         
          "10001079_5"   7 "Not Applicable"                                                         
          "10001079_5"   8 "Not Applicable"                                                         
          "10001079_5"   9 "Not Applicable"                                                         
          "10001079_5"  10 "Not Applicable"                                                         
          end
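
          The structure described above (one INDUSTRY entry per ID per period, four periods per ID, periods 9 and 10 post-policy) can be verified with a few assertions. This is a minimal sketch; the variable name post is an assumption introduced here for illustration.

          ```stata
          * Each ID-period combination should appear exactly once
          isid ID period

          * Each ID should have exactly four observations (periods 7, 8, 9, 10)
          bysort ID (period): assert _N == 4

          * Pre/post indicator: 0 for periods 7-8, 1 for periods 9-10
          generate byte post = inrange(period, 9, 10)
          tabulate period post
          ```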



          • #6
            So, this is complicated because you have a multilevel outcome (INDUSTRY) that you want to compare across not 2, but 4 time periods for the same group of respondents. The "usual suspects" like Mantel-Haenszel or McNemar are not applicable due to the multiplicity of both outcomes and time periods. So what you need is a multinomial logistic regression, ideally with an ID-level fixed effect. Unfortunately, there is no such command in Stata, nor, I believe, anywhere else: as far as I know, nobody has ever developed a fixed-effects -mlogit- estimator. But there is a random-effects -mlogit- estimator. It is not packaged as a single command in Stata but can be emulated using -gsem-.

            You will need to turn your INDUSTRY variable into a numerical variable. And I don't know if you want to consider "Not Applicable" and "Data Not Available" to be valid response categories or change them to missing values (in the code below, I keep them as valid response categories). Also, you may have to remove categories with very small n's (like Cement, Tiles, Bricks..., and Retail Trade in your example data) if there are such categories in your full data set. Anyway, here's a start:

            Code:
            encode INDUSTRY, gen(industry)
            
            drop if inlist(industry, 1, 5) // N's TOO SMALL!
            
            gsem (industry <- period RE[ID]@1, mlogit) // RE[ID] IS A RANDOM EFFECT AT THE ID LEVEL
            Now, if you run this on your example, it will not converge. I assume that this is due to peculiarities of the data that make it very difficult to fit to the likelihood function. (-mlogit- is, in general, pretty difficult to work with, even in the simpler cases, and especially so when adding in random effects, which can be difficult in their own right.) In particular, when the data are broken down into cells by industry, and period, they are probably too small to provide enough information for the estimation of all these effects. But hopefully in the full data set, you won't have this problem.
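
            Before fitting, the category codes and cell sizes discussed above can be inspected. The sketch below assumes industry has already been created by the -encode- call shown earlier. The variable name industry_valid is made up for illustration, and whether the two categories should be treated as missing at all is a substantive choice left open above.

            ```stata
            * Show which integer code -encode- assigned to each category
            label list industry

            * Cell counts by category and period; sparse cells suggest estimation trouble
            tabulate industry period

            * Optional alternative coding: set the two categories in question to missing
            generate long industry_valid = industry
            replace industry_valid = . if INDUSTRY == "Not Applicable" | INDUSTRY == "Data Not Available"
            label values industry_valid industry
            ```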



            • #7
              Thank you Clyde for your response. I shall run this code on the full data and respond on this thread to update you on the results.



              • #8
                I dropped Not Applicable and Data Not Available.
                I ran this code on the full data set, but from iteration 12 onward the log showed "not concave":
                Code:
                encode INDUSTRY_OF_OCCUPATION, gen(industry)
                
                .
                .
                . drop if inlist(industry, 1, 5) // N's TOO SMALL!
                (135,965 observations deleted)
                
                .
                . gsem (industry <- period RE[ID]@1, mlogit) // RE[ID] IS A RANDOM EFFECT AT THE ID LEVEL
                
                Fitting fixed-effects model:
                
                Iteration 0:   log likelihood = -1625106.4  
                Iteration 1:   log likelihood = -1621590.4  
                Iteration 2:   log likelihood = -1619517.1  
                Iteration 3:   log likelihood = -1619067.8  
                Iteration 4:   log likelihood = -1618901.1  
                Iteration 5:   log likelihood = -1618870.4  
                Iteration 6:   log likelihood = -1618862.7  
                Iteration 7:   log likelihood =   -1618861  
                Iteration 8:   log likelihood = -1618860.6  
                Iteration 9:   log likelihood = -1618860.5  
                Iteration 10:  log likelihood = -1618860.5  
                Iteration 11:  log likelihood = -1618860.5  
                Iteration 12:  log likelihood = -1618860.5  (not concave)
                Iteration 13:  log likelihood = -1618860.5  (not concave)
                Iteration 14:  log likelihood = -1618860.5  (not concave)
                Iteration 15:  log likelihood = -1618860.5  (not concave)
                Iteration 16:  log likelihood = -1618860.5  (not concave)
                Iteration 17:  log likelihood = -1618860.5  (not concave)
                Iteration 18:  log likelihood = -1618860.5  (not concave)
                Iteration 19:  log likelihood = -1618860.5  (not concave)
                Iteration 20:  log likelihood = -1618860.5  (not concave)
                I also ran the code with fewer categories, focusing on the major ones, but it got stuck at the same phase.
                Last edited by Titir Bhattacharya; 20 Aug 2020, 00:30.



                • #9
                  There may or may not be a solution to this problem. But here's how I would start. Rerun the -gsem- command, but add the -iterate(15)- option. That will take you to the place where it got stuck, and it will generate output that shows what the estimation looks like at that point. That output is not usable as results, but it can often point you to the source of non-convergence. Examine it for coefficients or standard errors that are obviously wrong. Then try either specifying start values that are close to what you expect to get (which you might find by just running -mlogit- on the data, not taking random effects into account), or perhaps eliminating those variables entirely.
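A rough sketch of those two steps, on simulated data rather than yours. Everything here is an assumption for illustration: the variable names, the simulated coefficients, and the three outcome categories. One deliberate substitution: the start values come from -gsem- fitted without the random effect rather than from -mlogit-, because the two commands name their coefficients differently and -from()- matches by name.

```stata
* Simulated panel with a true respondent-level random effect (hypothetical)
clear
set seed 2020
set obs 300
generate long id = _n
generate double u = rnormal()            // random effect shared by both equations
expand 4
bysort id: generate byte period = _n - 1
generate double e2 = exp(-0.2 + 0.3*period + u)
generate double e3 = exp( 0.1 - 0.2*period + u)
generate double r  = runiform() * (1 + e2 + e3)
generate byte industry = cond(r < 1, 1, cond(r < 1 + e2, 2, 3))

* Step 1: cap the iterations and inspect whatever estimates exist at that point
gsem (industry <- period RE[id]@1, mlogit), iterate(15)

* Step 2: fit the model without the random effect and save its coefficients
gsem (industry <- period, mlogit)
matrix b0 = e(b)

* Step 3: refit the full model starting from those coefficients
gsem (industry <- period RE[id]@1, mlogit), from(b0)
```

On well-behaved data like this simulation, step 1 may simply converge before iteration 15. On the real data the interest is in the partial output: very large coefficients or missing standard errors usually flag the category or period causing trouble.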

                  But before you do that, I realized that I didn't specify the model correctly. The code I wrote looks for a linear time trend, whereas you really want a comparison of 4 discrete time periods. So change period to i.period in the -gsem- command. That might run with no difficulties. If it doesn't, apply the approach from the previous paragraph to it, with the -iterate()- option specifying a number of iterations that takes you a few iterations past the point where it gets stuck.
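To make the difference concrete, here is a sketch that fits both specifications on simulated data and compares them. The data, variable names, and coefficients are made up for illustration. The likelihood-ratio comparison is an addition, included only because the linear-trend model is nested in the model with period indicators.

```stata
* Simulated panel (hypothetical data, not the poster's)
clear
set seed 2020
set obs 300
generate long id = _n
generate double u = rnormal()            // respondent-level random effect
expand 4
bysort id: generate byte period = _n - 1
generate double e2 = exp(-0.2 + 0.3*period + u)
generate double e3 = exp( 0.1 - 0.2*period + u)
generate double r  = runiform() * (1 + e2 + e3)
generate byte industry = cond(r < 1, 1, cond(r < 1 + e2, 2, 3))

* Linear time trend: one slope per outcome equation
gsem (industry <- period RE[id]@1, mlogit)
estimates store linear

* Discrete periods: a separate coefficient for each round relative to the base round
gsem (industry <- i.period RE[id]@1, mlogit)
estimates store discrete

* Likelihood-ratio test of the linear trend against the discrete-period model
lrtest linear discrete
```

With i.period, each non-base outcome gets its own coefficient for every round other than the base round, so the output answers "how does the distribution in round k differ from the base round" rather than "what is the average change per round".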

                  One more thing: I see you kept my -drop if inlist(industry, 1, 5)- command. But that's probably inappropriate in your real data. As the comment on that command says, I did it because the n's for those values were too small. But when you used it on your full data, it dropped nearly 136,000 observations: those n's are clearly not too small. So you should only drop those values of industry if you really want to exclude those categories from the analysis. Make sure you check the value label for industry to be sure which industries those values stand for: it probably will be different in your full data from what it was in the example data you gave.
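One way to avoid depending on the numeric codes at all is to look up the code from the label text. A small sketch with made-up data (the five rows below are invented for illustration):

```stata
* Tiny made-up example; -encode- assigns codes in alphabetical order of the strings
clear
input str20 INDUSTRY
"Crop Cultivation"
"Not Applicable"
"Retail Trade"
"Data Not Available"
"Crop Cultivation"
end
encode INDUSTRY, generate(industry)

* Show which numeric code stands for which industry in THIS data set
label list industry

* Drop by label text rather than by hard-coded number
drop if industry == "Not Applicable":industry | industry == "Data Not Available":industry

tabulate industry
```

The "text":labelname syntax returns the numeric value attached to that text in the named value label, so the -drop- keeps working even when -encode- numbers the categories differently in the full data.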
                  Last edited by Clyde Schechter; 20 Aug 2020, 11:42.
