lppinv, tmpinv, and tmpinvi (downloadable from the SSC) implement the proposed LPLS (linear programming through least squares) estimator with the help of the Moore-Penrose inverse (pseudoinverse), calculated using *singular value decomposition (SVD)*, with emphasis on the estimation of OLS constrained in values (cOLS), Transaction Matrix (TM), and custom (user-defined) cases. The pseudoinverse offers a unique minimum-norm least-squares solution, which is the best linear unbiased estimator (BLUE); see Albert (1972, Chapter VI). (Over)determined problems are accompanied by regression analysis, which is feasible in their case. For these and, especially, for all remaining cases, a Monte Carlo-based t-test of the mean NRMSE (RMSE normalized by the standard deviation of the RHS) is performed, with the sample drawn from a uniform or user-provided distribution (via a *Python function*).
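To make the NRMSE statistic mentioned above more concrete, here is a minimal NumPy sketch. It is not the code used by lppinv, tmpinv, or tmpinvi. The function names (`nrmse`, `mc_nrmse_sample`, `one_sample_t`), the choice to redraw the RHS from a uniform distribution, and the use of the population standard deviation are all assumptions made for illustration only.

```python
import numpy as np

def nrmse(a, x, b):
    """RMSE of the residuals a @ x - b, divided by the standard deviation of b."""
    resid = a @ x - b
    return np.sqrt(np.mean(resid ** 2)) / np.std(b)

def mc_nrmse_sample(a, n_draws=200, low=0.0, high=1.0, seed=0):
    """Assumed Monte Carlo scheme: draw the RHS from U(low, high), solve, record the NRMSE."""
    rng = np.random.default_rng(seed)
    pinv_a = np.linalg.pinv(a)
    out = np.empty(n_draws)
    for i in range(n_draws):
        b_sim = rng.uniform(low, high, size=a.shape[0])
        out[i] = nrmse(a, pinv_a @ b_sim, b_sim)
    return out

def one_sample_t(sample, value):
    """t statistic for H0: the mean of `sample` equals `value`."""
    n = sample.size
    return (sample.mean() - value) / (sample.std(ddof=1) / np.sqrt(n))

# Small overdetermined system (6 equations, 2 unknowns), made up for the illustration
a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
              [1.0, -1.0], [2.0, 1.0], [0.5, 3.0]])
b = np.array([1.0, 2.0, 2.5, -1.5, 4.0, 6.0])
x = np.linalg.pinv(a) @ b
observed = nrmse(a, x, b)
t_stat = one_sample_t(mc_nrmse_sample(a), observed)
```

The t statistic compares the NRMSE of the actual solution with the mean NRMSE obtained from random right-hand sides; the help files remain the authority on how the commands actually build and test the sample.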
OLS constrained in values (cOLS) is an estimation based on constraints in the model and/or data but not in parameters. Typically, such models are of size ≤ kN, where N is the number of observations, since the number of their constraints may vary in the LHS (e.g., level, derivatives, etc.).
Example of a cOLS problem:
Estimate the trend and the cyclical component of a country's GDP given the textbook or any other definition of its peaks, troughs, and saddles. For a pre-LPLS approach to this problem, see (Bolotov, 2014).
Transaction Matrix (TM) of size (M x N) is a formal model of interaction (allocation, assignment, etc.) between M and N elements in any imaginable system, such as intercompany transactions (netting tables), industries within/between economies (input-output tables), cross-border trade/investment (trade/investment matrices), etc., where row and column sums are known, but individual elements of the TM may not be:
- a netting table is a type of TM where M = N and the elements are subsidiaries of an MNC;
- an input-output table (IOT) is a type of TM where M = N and the elements are industries;
- a matrix of trade/investment is a type of TM where M = N and the elements are countries or (macro)regions, where diagonal elements may be equal to zero;
- a country-product matrix is a type of TM where M ≠ N and the elements are of different types;
...
Example of a TM problem:
Estimate the matrix of trade/investment with/without zero diagonal elements, the country shares in which are unknown. For a pre-LPLS approach to this problem, see (Bolotov, 2015).
Methods and formulas:
The LP problem in the LPLS estimator is a matrix equation a @ x = b, loosely based on the structure of the Simplex tableau, where a consists of coefficients for the CONSTRAINTS (LP-type CHARACTERISTIC and/or SPECIFIC) and for the SLACK/SURPLUS VARIABLES (the upper part), as well as for the MODEL (the lower part), as illustrated in Figure 1. Each part of a can be omitted to accommodate a particular case:
- cOLS problems require SPECIFIC CONSTRAINTS, no LP-type CHARACTERISTIC CONSTRAINTS, and a MODEL;
- TM requires LP-type CHARACTERISTIC CONSTRAINTS, no SPECIFIC CONSTRAINTS, and an optional MODEL;
- SLACK/SURPLUS VARIABLES are included only for inequality constraints and should be set to 1 or -1;
...
Figure 1: Matrix equation a @ x = b
Upper part: a = [ CONSTRAINTS: CHARACTERISTIC/SPECIFIC | SL/SU VARIABLES ], b = [ CONSTRAINTS ]
Lower part: a = [ MODEL ], b = [ MODEL ]
Source: self-prepared
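As an illustration of the upper part of a for a TM problem, the sketch below stacks the row-sum and column-sum constraints of an M x N matrix. The helper name `tm_constraints` and the row-by-row flattening of the unknown matrix into x are assumptions of this sketch, not a description of how tmpinv builds its "characteristic matrix" internally.

```python
import numpy as np

def tm_constraints(row_sums, col_sums):
    """Return (a, b) for the row-sum and column-sum constraints of an M x N matrix.

    Assumption of this sketch: x is the unknown matrix flattened row by row.
    """
    r = np.asarray(row_sums, dtype=float)
    c = np.asarray(col_sums, dtype=float)
    m, n = r.size, c.size
    rows = np.kron(np.eye(m), np.ones((1, n)))   # constraint i adds up TM row i
    cols = np.kron(np.ones((1, m)), np.eye(n))   # constraint j adds up TM column j
    return np.vstack([rows, cols]), np.concatenate([r, c])

# A known 2 x 3 matrix, used only to check the construction
known = np.array([[1.0, 2.0, 0.0],
                  [3.0, 3.0, 1.0]])
a, b = tm_constraints(known.sum(axis=1), known.sum(axis=0))
print(a.shape)                           # (M + N, M * N)
print(np.allclose(a @ known.ravel(), b))
```

With M = 2 and N = 3 the matrix a is 5 x 6, and its rank is M + N - 1 because the row sums and the column sums share the same grand total, so a is rank deficient by construction.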
The solution to the equation, x = pinv(a) @ b, is estimated with the help of SVD and is a minimum-norm least-squares generalized solution if the rank of a is not full. To check if a is within the computational limits, its (maximum) dimensions can be calculated using the formulas:
- (k \* N) x (K + K\*) - cOLS - without slack/surplus variables;
- (k \* N) x (K + K\* + l) - cOLS - with slack/surplus variables;
- (M + N) x (M \* N) - TM - without slack/surplus variables;
- (M + N) x (M \* N + l) - TM - with slack/surplus variables;
- M x N - custom - without slack/surplus variables;
- M x (N + l) - custom - with slack/surplus variables;
where, in cOLS problems, K is the number of independent variables in the model (including the constant), K\* is the number of any extra variables in CONSTRAINTS, and N is the number of observations; in TM, M and N are the dimensions of the matrix; in custom cases, M x N or M x (N + l) are the dimensions of a; and l is the number of slack/surplus variables.
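The solution x = pinv(a) @ b described above can be sketched in a few lines of NumPy. The helper `pinv_svd` and the tolerance `rtol` are hypothetical names chosen for this illustration; the point is only to show how the pseudoinverse follows from the SVD and why the result is the minimum-norm solution when a does not have full column rank.

```python
import numpy as np

def pinv_svd(a, rtol=1e-12):
    """Moore-Penrose pseudoinverse from the SVD a = u @ diag(s) @ vt."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    cutoff = rtol * s.max() if s.size else 0.0
    s_inv = np.zeros_like(s)
    keep = s > cutoff                 # invert only the non-negligible singular values
    s_inv[keep] = 1.0 / s[keep]
    return vt.T @ (s_inv[:, None] * u.T)

# Underdetermined toy system: 2 equations, 3 unknowns
a = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

x = pinv_svd(a) @ b                   # minimum-norm solution of a @ x = b
z = np.array([1.0, -1.0, 1.0])        # a @ z = 0, so x + 0.5 * z also solves the system
x_other = x + 0.5 * z

print(np.allclose(a @ x, b), np.allclose(a @ x_other, b))
print(np.linalg.norm(x) < np.linalg.norm(x_other))
```

Both vectors satisfy the equations exactly, but x has the smaller Euclidean norm. `np.linalg.pinv(a)` returns the same matrix as `pinv_svd(a)` here and is the natural choice in practice.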
Example:
Task: Estimate a trade matrix with zero diagonal elements, the country shares in which are unknown:
We will use the multistep program tmpinvi for this purpose:
Code:
. clear
. * Let's make a dataset using Mata code and print it
. set obs 10
. gen str13 country_id = ""
. mata: st_local("b", "Country #"); st_local("e", "Rest of World")
. mata: st_sstore(., "country_id", ("`b'":+strofreal(1::9)\"`e'"))
. gen float exports = runiform(0, 1000)
. gen float imports = runiform(0, 1000)
. list
. * OLS estimation (possible for TM because of their estimation method):
. tmpinvi exports imports, zerod adj(ave) l(0) u(1000)
. matlist r(solution)
. return list
. * Minimum-norm least-squares generalized solution:
. tmpinvi exports imports, zerod adj(ave) subm(10) l(0) u(1000)
. matlist r(solution)
. return list
NB Please consult the tmpinvi help file for the specifics of the TM problems (they have a "characteristic matrix", estimation via contiguous submatrices, pre-compiled Monte Carlo-based t-test samples, etc.).
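For readers who want to see the idea behind the task outside Stata, here is a rough NumPy sketch of a minimum-norm estimate of a square matrix with a zero diagonal from its row and column sums. It is not tmpinvi's algorithm: the function name `estimate_tm_zero_diag` is hypothetical, the bounds set by l() and u() are not enforced, no submatrix splitting or adjustment of totals is done, and the exports/imports figures are made up so that both totals coincide.

```python
import numpy as np

def estimate_tm_zero_diag(row_sums, col_sums):
    """Minimum-norm least-squares estimate of a square TM with a zero diagonal."""
    r = np.asarray(row_sums, dtype=float)
    c = np.asarray(col_sums, dtype=float)
    n = r.size
    if c.size != n:
        raise ValueError("a zero diagonal requires a square matrix")
    # Row-sum and column-sum constraints for the matrix flattened row by row
    a = np.vstack([np.kron(np.eye(n), np.ones((1, n))),
                   np.kron(np.ones((1, n)), np.eye(n))])
    b = np.concatenate([r, c])
    off_diag = ~np.eye(n, dtype=bool).ravel()   # unknowns that are not on the diagonal
    x = np.zeros(n * n)
    x[off_diag] = np.linalg.pinv(a[:, off_diag]) @ b
    return x.reshape(n, n)

exports = [30.0, 50.0, 20.0]   # row sums (illustrative values)
imports = [40.0, 25.0, 35.0]   # column sums with the same grand total
tm = estimate_tm_zero_diag(exports, imports)
print(np.round(tm, 2))
```

Dropping the diagonal columns from a is one simple way to impose the zero diagonal; because the totals agree, the estimated matrix reproduces both sets of sums, although without bounds individual cells are not guaranteed to be non-negative.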
References:
- Albert, A. 1972. Regression and the Moore-Penrose Pseudoinverse. New York: Academic Press.
- Bolotov, I. 2014. Modeling of Time Series Cyclical Component on a Defined Set of Stationary Points and its Application on the US Business Cycle. [Paper presentation]. The 8th International Days of Statistics and Economics: Prague. https://msed.vse.cz/msed_2014/articl...Ilya-paper.pdf
- Bolotov, I. 2015. Modeling Bilateral Flows in Economics by Means of Exact Mathematical Methods. [Paper presentation]. The 9th International Days of Statistics and Economics: Prague. https://msed.vse.cz/msed_2015/articl...Ilya-paper.pdf