  • Linear Programming in Mata with panel data - looping over individuals, specifying objective function

    I am trying to use linear programming in Mata, and I am new to Mata.

    *################################################# # IMPUTATION OF EDUCATION
    I would like to assign an education category (z, s, v = low, middle, high) to each individual according to probabilities obtained previously from an ordered probit. I also have a marginal distribution for each category in the population.

    P* are my probabilities for each category, R* are the ranks of those probabilities within each individual (1 = highest probability), and Q* is the number of individuals in each educational category in the population.


    *################################################# # MATH
    I would like to solve an optimization problem like this:

    Objective function minimising the ranks = choose the most probable education category subject to marginal distribution:
    min ∑ (RzXz + RsXs + RvXv)

    Constraints:
    X* should be 0 or 1
    Xz + Xs + Xv = 1 // I know I may not get a discrete solution, but I can post-process it later in Stata by setting the highest X to 1 and the others to 0
    ∑Xz = Qz
    ∑Xs = Qs
    ∑Xv = Qv
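
    Written with explicit indices, the problem above looks like a transportation-type linear program. The following is only a restatement under assumed notation: i = 1, ..., n indexes individuals, k indexes the categories z, s, v, and x_{ik} is the (relaxed) indicator that individual i is assigned category k. None of these index names appear in the original post.

    ```latex
    \begin{aligned}
    \min_{x}\quad & \sum_{i=1}^{n} \sum_{k \in \{z,s,v\}} R_{ik}\, x_{ik} \\
    \text{s.t.}\quad & \sum_{k \in \{z,s,v\}} x_{ik} = 1, \qquad i = 1,\dots,n, \\
    & \sum_{i=1}^{n} x_{ik} = Q_k, \qquad k \in \{z,s,v\}, \\
    & 0 \le x_{ik} \le 1 .
    \end{aligned}
    ```

    Two things follow from this form. First, the category totals couple all individuals, so the program has 3n unknowns and has to be solved once for the whole sample rather than once per individual. Second, summing the first set of constraints over i gives n, so the system is only feasible if Qz + Qs + Qv = n; totals taken from the population would need to be rescaled to the sample size and rounded to integers first. With integer totals, problems of this transportation type have optimal solutions in which every x_{ik} is 0 or 1.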


    *################################################# # MY PROBLEMS
    Since I have to run it over each individual, I referred to Sebastian's question here:
    https://www.statalist.org/forums/for...e-observations

    I obtain two error messages:

    invalid dimension of equality constraints
    The equality constraints system matrix must have the same rows as the right hand side.
    r(3200);

    and

    too few variables specified for matrix X
    r(102);

    and of course X is an empty matrix.
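
    The first message can probably be traced to the shapes of the two arguments passed to setEquality(). The sketch below only checks dimensions; the names Qcol and the sizes 3 and 5 are stand-ins chosen to mimic the five-observation example, not objects from the original code.

    ```mata
    mata
    X = J(3, 5, .)                   // same shape as J(m, n, .) with m = 3, n = 5
    rows(sum(X)), cols(sum(X))       // sum() collapses the whole matrix to one scalar
    Qcol = J(5, 1, 5956.511)         // stand-in for a column created by putmata, e.g. Qz
    rows((Qcol, Qcol, Qcol))         // (Qz, Qs, Qv) is 5 x 3, so the right-hand side has 5 rows
    end
    ```

    A left-hand side with one row and a right-hand side with five rows would be consistent with "must have the same rows as the right hand side". I would also expect a second call to setEquality() to replace the first rather than add to it, so all equality rows would need to be stacked into a single matrix. The second message looks like it comes from getmata, which for a matrix with several columns needs a list of variable names, for example getmata (Xz Xs Xv) = X with hypothetical names Xz, Xs, Xv.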

    *################################################# # QUESTIONS
    For Q*, I am aware that I should specify a scalar instead of a vector that repeats the same number. Q* is simply a Stata variable that I merged in with the distributional data, but I am not sure how to turn it into a scalar (local macros did not work).
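
    One possible way to get the totals as scalars is to read a single observation of the Q* variables into Mata. This is a minimal sketch that assumes Qz, Qs and Qv are variables in the data in memory and are constant across observations; the name Qz1 is made up for illustration.

    ```mata
    mata
    // read observation 1 only: the result is a 1 x 3 rowvector
    Q = st_data(1, ("Qz", "Qs", "Qv"))
    Qz1 = Q[1]                        // scalar total for category z
    Q
    end
    ```

    If the columns have already been copied with putmata, Qz[1] picks out the same scalar from the column vector.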

    Other than that, I do not know where to start. I would like to obtain a matrix X with three columns telling me which category has to be chosen for each individual, subject to the marginal distribution.

    I would be grateful for any advice or hints on what to consider.


    *################################################# # EXAMPLE
    My dataset:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(Pz Ps Pv Qz Qs Qv) byte(Rz Rs Rv)
     .0226813 .8045667    .172752 5956.511 72222.695 5336.041 3 1 2
    .09679342 .8531785  .05002809 5956.511 72222.695 5336.041 2 1 3
    .06026781 .8577851   .0819471 5956.511 72222.695 5336.041 3 1 2
    .15252922 .8199771 .027493654 5956.511 72222.695 5336.041 2 1 3
    .24786794 .7403268 .011805267 5956.511 72222.695 5336.041 2 1 3
    end
    My attempt:
    Code:
    putmata Pz Ps Pv Qz Qs Qv Rz Rs Rv, replace
    mata
    lp = LinearProgram()

    P = (Pz, Ps, Pv)                // probabilities
    Q = (Qz, Qs, Qv)                // marginal distributions
    R = (Rz, Rs, Rv)                // ranks

    m = cols(R)
    n = rows(R)

    for (i=1; i<=n; i++) {
        // ----------------------------------------------------
        // coefficients
        X = J(m, n, .)
        coefs = (sum(Rz), sum(Rs), sum(Rv))

        // equality constraints
        eq_lhs = (1, 1, 1) ; eq_rhs = 1
        eq_l = sum(X) ; eq_r = Q

        // bounds
        lower = (0, 0, 0)
        upper = (1, 1, 1)

        // ----------------------------------------------------
        // set class
        lp.setMaxOrMin("min")
        lp.setCoefficients(coefs)
        lp.setEquality(eq_lhs, eq_rhs)
        lp.setEquality(eq_l, eq_r)
        lp.setBounds(lower, upper)

        // ----------------------------------------------------
        // solve linear program
        lp.optimize()
        lp.parameters()
        x[i,.] = lp.parameters()      // Store parameter
    }
    // ----------------------------------------------------
    // done

    // Return solution as vector (x, value_of_objective_function)

    X
    st_matrix("coefs", X) // export to Stata
    end
    getmata X

  • #2
    I modified my code and objective function:
    • took the sum of the Xs out of the loop,
    • replaced the ranks with the probabilities: I do not actually need the ranks, since I can maximize the probabilities instead of minimizing the ranks,
    • changed the input data so that the marginal distribution can be reached with the first 10 observations in my example.
    PROBLEM: I would like to assign an education category (z, s, v = low, middle, high) to each individual according to probabilities obtained previously from an ordered probit. I also have a marginal distribution for each category in the population.

    Objective function maximising the probability = choose the most probable education category subject to marginal distribution:
    max ∑ (PzXz + PsXs + PvXv)

    X should be 1 or 0
    Xz + Xs + Xv = 1

    ∑Xz = Qz, ∑Xs = Qs, ∑Xv = Qv

    My Code is now:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(Pz Ps Pv Qz Qs Qv) 
     .0226813 .8045667    .172752 3 5 2
    .09679342 .8531785  .05002809 3 5 2
    .06026781 .8577851   .0819471 3 5 2
    .15252922 .8199771 .027493654 3 5 2
    .24786794 .7403268 .011805267 3 5 2
     .0226813 .8045667    .172752 3 5 2
    .09538278 .8537293   .0508879 3 5 2
    .05995392 .8576999  .08234616 3 5 2
    .08492604 .8570921  .05798185 3 5 2
    .09858042 .8524512  .04896835 3 5 2
    end

    Code:
    putmata Pz Ps Pv Qz Qs Qv, replace
    mata
    lp = LinearProgram()

    P = (Pz, Ps, Pv)                // probabilities
    Q = (Qz, Qs, Qv)                // marginal distributions
    m = cols(P)
    n = rows(P)

    // has to be out of the loop
    X = J(m, n, .)
    eq_l = sum(X) ; eq_r = Q

    for (i=1; i<=n; i++) {
        // ----------------------------------------------------
        // coefficients
        coefs = (Pz, Ps, Pv)

        // equality constraints
        eq_lhs = (1, 1, 1) ; eq_rhs = 1

        // bounds
        lower = (0, 0, 0)
        upper = (1, 1, 1)

        // ----------------------------------------------------
        // set class
        lp.setMaxOrMin("max")
        lp.setCoefficients(coefs)
        lp.setEquality(eq_lhs, eq_rhs)
        lp.setEquality(eq_l, eq_r)
        lp.setBounds(lower, upper)

        // ----------------------------------------------------
        // solve linear program
        lp.optimize()
        lp.parameters()
        X[i,] = lp.parameters()      // Store parameter
    }
    // ----------------------------------------------------
    // done

    // Return solution as vector (x, value_of_objective_function)

    X
    st_matrix("coefs", X) // export to Stata
    end
    //getmata X
    I now obtain this error message:

    invalid dimension of coefficients
    The coefficients must be a rowvector with length at least 1.

    I still think that I should store Q* as scalars rather than as vectors copied from the Stata variables, but I do not know how to do it. I am also struggling to be sure whether this approach actually does what I intend.

    The error comes after this line:
    Code:
     X[i,.] = lp.parameters()   // Store parameter
    Last edited by Katharina Rabe; 19 Oct 2023, 05:00. Reason: I updated the objective function my code
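
    For what it is worth, the problem in #2 can be written as one joint linear program instead of a loop. The message "The coefficients must be a rowvector" fits the fact that (Pz, Ps, Pv) is an n x 3 matrix, so the sketch below flattens it into a 1 x 3n row vector and builds the constraint rows with Kronecker products. It is a sketch under assumptions, not a tested solution: it assumes the ten-observation example data are in memory, that Qz + Qs + Qv equals the number of observations, and the names c, A_ind, A_cat, Aec, bec, Xz, Xs and Xv are made up here.

    ```mata
    mata
    P = st_data(., ("Pz", "Ps", "Pv"))      // n x 3 probabilities
    Q = st_data(1, ("Qz", "Qs", "Qv"))      // 1 x 3 category totals, read once
    n = rows(P)
    m = cols(P)

    // decision vector ordered by individual: (x_1z, x_1s, x_1v, x_2z, ...)
    c = vec(P')'                            // 1 x n*m objective coefficients

    // one row per individual: x_iz + x_is + x_iv = 1
    A_ind = I(n) # J(1, m, 1)               // n x n*m

    // one row per category: sum over i of x_ik = Q_k
    // the last category is dropped because the other rows already imply it
    A_cat = J(1, n, 1) # I(m)               // m x n*m
    A_cat = A_cat[|1, 1 \ m - 1, .|]        // keep the first m-1 rows

    Aec = A_ind \ A_cat                     // all equality rows stacked
    bec = J(n, 1, 1) \ Q[|1 \ m - 1|]'      // matching right-hand side

    lp = LinearProgram()
    lp.setMaxOrMin("max")
    lp.setCoefficients(c)
    lp.setEquality(Aec, bec)
    lp.setBounds(J(1, n*m, 0), J(1, n*m, 1))
    lp.optimize()

    // back to one row per individual, columns z, s, v
    X = colshape(lp.parameters(), m)
    X
    end
    getmata (Xz Xs Xv) = X, replace
    ```

    The equality matrix and its right-hand side are built together so that they always have the same number of rows. The solver may return values that are only close to 0 or 1, and ties between individuals can produce fractional values, so the result may still need rounding or a tie-breaking rule before it is used as an assignment.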
