  • Trying to get unique pairs for industries but maintaining the output value

    Hi All,
    I am working with a dataset that provides information on inputs and outputs. Please see the data example below.
    I am trying to create unique pairs of naics_user and naics_supplier using the collapse command as follows:

    collapse (sum) tag value (mean) total_interm_user (mean) total_output_user, by(naics_user naics_title_user naics_supplier naics_title_supplier code_user)

    However, after this collapse, when I check the total_output_user variable, I notice that it is not always the same. Total output for a user should be determined by the code_user variable (as it is before the collapse). I suspect the issue is that one naics_user is related to multiple code_user values. Where that is the case, I want total_output_user to be summed across those code_user values. Where it is not, I want total_output_user to be the mean over those observations.

    Here is the dataset before the collapse command.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str22 code_supplier str96 description_supplier str7 code_user str96 description_user str119 naics_title_supplier
    "S00600" "Federal general government (nondefense)"     "230302" "Residential maintenance and repair"    "Administration of General Economic Programs "                                        
    "GSLGO"  "State and local government (other services)" "230302" "Residential maintenance and repair"    "Regulation and Administration of Transportation Programs "                           
    "S00600" "Federal general government (nondefense)"     "230302" "Residential maintenance and repair"    "Regulation and Administration of Transportation Programs "                           
    "S00600" "Federal general government (nondefense)"     "230302" "Residential maintenance and repair"    "Regulation and Administration of Communications, Electric, Gas, and Other Utilities "
    "GSLGO"  "State and local government (other services)" "230302" "Residential maintenance and repair"    "Regulation and Administration of Communications, Electric, Gas, and Other Utilities "
    "S00600" "Federal general government (nondefense)"     "230302" "Residential maintenance and repair"    "Regulation of Agricultural Marketing and Commodities "                               
    "GSLGO"  "State and local government (other services)" "230302" "Residential maintenance and repair"    "Regulation of Agricultural Marketing and Commodities "                               
    "GSLGO"  "State and local government (other services)" "230302" "Residential maintenance and repair"    "Regulation, Licensing, and Inspection of Miscellaneous Commercial Sectors "          
    "S00600" "Federal general government (nondefense)"     "230302" "Residential maintenance and repair"    "Regulation, Licensing, and Inspection of Miscellaneous Commercial Sectors "          
    "S00102" "Other federal government enterprises"        "230302" "Residential maintenance and repair"    "Space Research and Technology "                                                      
    "S00500" "Federal general government (defense)"        "230302" "Residential maintenance and repair"    "National Security "                                                                  
    "S00600" "Federal general government (nondefense)"     "230302" "Residential maintenance and repair"    "International Affairs "                                                              
    "1111A0" "Oilseed farming"                             "233230" "Manufacturing structures"              "Soybean Farming"                                                                     
    "1111A0" "Oilseed farming"                             "230301" "Nonresidential maintenance and repair" "Soybean Farming"                                                                     
    "1111A0" "Oilseed farming"                             "230301" "Nonresidential maintenance and repair" "Oilseed (except Soybean) Farming "                                                   
    "1111A0" "Oilseed farming"                             "233230" "Manufacturing structures"              "Oilseed (except Soybean) Farming "                                                   
    "1111B0" "Grain farming"                               "233230" "Manufacturing structures"              "Dry Pea and Bean Farming "                                                           
    "1111B0" "Grain farming"                               "230301" "Nonresidential maintenance and repair" "Dry Pea and Bean Farming "                                                           
    "1111B0" "Grain farming"                               "233230" "Manufacturing structures"              "Wheat Farming"                                                                       
    "1111B0" "Grain farming"                               "230301" "Nonresidential maintenance and repair" "Wheat Farming"                                                                       
    end

    I will be grateful for your help. Thanks in advance!

    Regards,
    Preety

  • #2
    Two things would help.

    1. Include all the variables from your collapse command in the dataex.
    2. Collapse the example and point to the problem.

    I would think "(sum) total_output_user" might get you there, since the sum and the mean of a single observation are the same. You may also be collapsing "by" too many variables.
    Hard to say with this dataex.
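
    In the meantime, here is a minimal sketch of one way to check the suspicion in #1 and to build the pairs without code_user in by(). It is an alternative to a plain "(sum) total_output_user", not a confirmed fix. The numbers and NAICS codes below are made up for illustration, naics_user and naics_supplier are assumed to be strings, and one_per_code, n_codes and output_naics_user are hypothetical helper names.

    ```stata
    * Toy data: all values are invented for illustration only
    clear
    input str6 naics_user str6 naics_supplier str7 code_user value total_output_user
    "236118" "111110" "230302" 5 100
    "236118" "111120" "230302" 3 100
    "236118" "111110" "230301" 2  40
    "236220" "111110" "233230" 4  70
    "236220" "111120" "233230" 1  70
    end

    * Check 1: total_output_user should be constant within code_user
    bysort code_user (total_output_user): assert total_output_user[1] == total_output_user[_N]

    * Check 2: how many distinct code_user values sit under each naics_user?
    egen byte one_per_code = tag(naics_user code_user)
    egen n_codes = total(one_per_code), by(naics_user)
    tabulate naics_user n_codes

    * Sum total output once per distinct code_user within naics_user.
    * With a single code_user this equals the original value (and the mean).
    egen double output_naics_user = total(one_per_code * total_output_user), by(naics_user)

    * Collapse to unique user-supplier pairs; code_user is left out of by()
    collapse (sum) value (first) output_naics_user, by(naics_user naics_supplier)

    * Confirm the pairs are unique and inspect the result
    isid naics_user naics_supplier
    list, sepby(naics_user)
    ```

    If the first assert fails on the real data, total_output_user is not actually fixed within code_user, and that would need sorting out before any aggregation.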

    • #3
      Thanks, George. dataex is not able to generate an example with all my variables. Is there a way to show that?

      • #4
        Drop the descriptions. They are redundant given the codes.
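
        Something along these lines might work, assuming the long description and title strings are what stop dataex from producing an example. The variable names are taken from the collapse command and the dataex in #1; the "in 1/20" range is an arbitrary choice for illustration.

        ```stata
        * List only the codes and the numeric variables used in -collapse-,
        * leaving out the description/title strings
        dataex code_user code_supplier naics_user naics_supplier tag value total_interm_user total_output_user in 1/20
        ```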
