  • matrix


    Dear all,

    I would like to construct a matrix whose elements equal 1 if two firms belong to the same industry and 0 otherwise. In addition, the diagonal elements need to be set to zero.

    I have a firm-level dataset with 2007 firms, so the matrix should be 2007 x 2007. My industry variable is a three-digit code, which looks like this:

    Firm industry
    1 354
    2 384
    3 135
    4 274
    5 274
    ...
    2007 354
    I tried using the spmat command:
    spmat contiguity industry using "filedirectory", id(industry)
    However, the message I get is

    "industry values must be unique"
    Maybe a loop is needed, which I am struggling to write?

    I hope someone could help me with this.

    Thank you.

    Mina
    Last edited by sladmin; 27 Oct 2015, 10:03. Reason: user request

  • #2
    Your problem amounts to pairing all firms within the same industry. This is easily done with joinby. You can then reshape the data to wide form to create the matrix. Since identifiers and industry codes are usually string variables, here's an example that uses strings.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10(firm industry)
    "40000020" "354"
    "40000035" "384"
    "40000038" "135"
    "40000039" "274"
    "40000044" "274"
    "40000045" "384"
    "40000261" "274"
    end
    
    tempfile f
    save "`f'"
    
    * form all pairwise combinations of firms within the same industry
    rename firm firm2
    joinby industry using "`f'"
    
    * all pairs of firm id are in the same industry but ignore self
    gen c = 1
    replace c = 0 if firm == firm2
    
    * reshape to wide form
    reshape wide c, i(firm) j(firm2) string
    
    * missing combinations are not in the same industry
    mvencode _all, mv(0) override
    
    mkmat c*, mat(C) rown(firm)
    mat list C


    • #3
      Robert Picard, many thanks for this response!

      I am trying to apply your instructions to my dataset, but when I run:
      . tempfile f

      . save "`f'"
      file /var/folders/gl/gl4XR7+zHIayxvWCE9Opyk+++TI/-Tmp-//S_00301.000002 saved
      joinby industry using "`f'"
      I get the following error:
      invalid file specification
      This prevents me from making further progress.

      What am I doing wrong?


      • #4
        It's probably because you are running parts of the code at a time. The simplest fix is to switch to a permanent file name for the data. Here's a revised example:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str10(firm industry)
        "40000020" "354"
        "40000035" "384"
        "40000038" "135"
        "40000039" "274"
        "40000044" "274"
        "40000045" "384"
        "40000261" "274"
        end
        
        save "dummy_data.dta", replace
        
        * form all pairwise combinations of firms within the same industry
        rename firm firm2
        joinby industry using "dummy_data.dta"
        
        * all pairs of firm id are in the same industry but ignore self
        gen c = 1
        replace c = 0 if firm == firm2
        
        * reshape to wide form
        reshape wide c, i(firm) j(firm2) string
        
        * missing combinations are not in the same industry
        mvencode _all, mv(0) override
        
        mkmat c*, mat(C) rown(firm)
        mat list C
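
        Some background on why the tempfile version can fail, as a minimal sketch: tempfile stores the temporary file's path in a local macro (f here), and both the macro and the file are discarded as soon as the do-file or selection that created them finishes. Assuming the lines were run from the Do-file Editor as separate selections, the macro is already empty by the time joinby runs:

        ```stata
        * run these two lines together in one selection: a temporary path is displayed
        tempfile f
        display "`f'"

        * now run only this line as a separate selection: it displays an empty line,
        * so a command such as -joinby industry using "`f'"- sees an empty file name
        display "`f'"
        ```

        Running the whole example in one go avoids the problem as well, which is why the original code works when executed from top to bottom.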


        • #5
          Robert, thank you! This fixed the issue, but only when the dataset consists of just two variables, the industry variable and the firm ID (as in dummy_data.dta). My real dataset, however, has 100 other variables (which have nothing to do with the matrix construction). With that setup, I get an error at this step:

              reshape wide c, i(firm) j(firm2) string

          The error message is:

              variable not constant within firm
              Type "reshape error" for a listing of the problem observations.
              r(9);

          Stata repeats this message for each variable.

          Does this mean that I can only construct the matrix in a separate dataset that contains only the variables needed for it?
          In addition, would it be possible to standardize the matrix so that each row sums to one?

          Thank you so much for all the help!


          • #6
            Yes, create the matrix separately; just keep firm and industry and create the matrix with those variables only. You can always merge the data back by firm identifier.
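
            A minimal sketch of that workflow, assuming the full dataset is saved under the hypothetical name full_data.dta and that firm uniquely identifies its observations (firm_results.dta is likewise a hypothetical file holding one observation per firm):

            ```stata
            * full_data.dta is a hypothetical name for the dataset with all variables
            use "full_data.dta", clear

            * keep only what the matrix needs and save it under a separate name
            keep firm industry
            save "dummy_data.dta", replace

            * ... build the matrix from dummy_data.dta as in the examples in this thread ...

            * anything stored as one observation per firm can be merged back later
            use "full_data.dta", clear
            merge 1:1 firm using "firm_results.dta"
            ```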

            You won't be able to create a matrix whose rows all sum to 1 if some firms are alone in their industry, because those rows contain only zeros. Here's a modified example that weights each element by the number of other firms in the industry:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str10(firm industry)
            "40000020" "274"
            "40000035" "384"
            "40000038" "135"
            "40000039" "274"
            "40000044" "274"
            "40000045" "384"
            "40000261" "274"
            end
            
            save "dummy_data.dta", replace
            
            * form all pairwise combinations of firms within the same industry
            rename firm firm2
            joinby industry using "dummy_data.dta"
            
            * The number of other firms in the industry
            bysort firm (firm2): gen N = _N - 1
            gen a = 1 / N
            replace a = 0 if firm == firm2
            
            * reshape to wide form
            reshape wide a, i(firm) j(firm2) string
            
            * missing combinations are not in the same industry
            mvencode _all, mv(0) override
            
            mkmat a*, mat(A) rown(firm)
            mat list A, nohalf
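
            As a quick sanity check (a sketch that is not part of the example above, and which assumes matrix A is still in memory), the row sums can be obtained by postmultiplying A by a column of ones, and matmissing() reports whether any element is missing. With the example data, the firm that is alone in industry 135 should show a row sum of 0 and every other firm a row sum of 1, up to rounding:

            ```stata
            * row sums of A: postmultiply by a column vector of ones
            matrix rs = A * J(colsof(A), 1, 1)
            matrix colnames rs = rowsum
            matrix list rs

            * displays 1 if A contains any missing value, 0 otherwise
            display matmissing(A)
            ```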


            • #7
              Robert, this is exactly what I need. I first dropped the 15 firms that had no "neighboring" firm in their industry, and then ran your commands.

              However, when I try to use this matrix in my spatial model, it does not work.

              I get the following message:

                  ==============================================================================
                  *** Binary (0/1) Weight Matrix: 1992x1992 (Non Normalized)
                  ==============================================================================

                  *** Observations have (1445) Missing Values
                  *** You can use zero option to Convert Missing Values to Zero

              I am not sure what the zero option means here, but no results are displayed.

              Many many thanks for all the help, which is surely more than I expected from this post.


              • #8
                Sorry but what to do with the matrix is beyond my pay grade. Perhaps someone else with experience with this type of analysis can help Mina.
