The LPLS (linear problems through least squares) estimator in lppinv, tmpinv, and tmpinvi commands

Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#1

The LPLS (linear problems through least squares) estimator in lppinv, tmpinv, and tmpinvi commands

22 Feb 2024, 05:48

lppinv, tmpinv, and tmpinvi (downloadable from the SSC) implement the proposed LPLS (linear programming through least squares) estimator with the help of the Moore-Penrose inverse (pseudoinverse), calculated using *singular value decomposition (SVD)*, with emphasis on the estimation of OLS constrained in values (cOLS), Transaction Matrix (TM), and custom (user-defined) cases. The pseudoinverse offers a unique minimum-norm least-squares solution, which is the best linear unbiased estimator (BLUE); see Albert (1972, Chapter VI). (Over)determined problems are accompanied by regression analysis, which is feasible in their case. For such and especially all remaining cases, a Monte Carlo-based ttest of mean NRMSE (normalized by the standard deviation of the RHS) is performed, the sample being drawn from a uniform or user-provided distribution (via a *Python function*).

OLS constrained in values (cOLS) is an estimation based on constraints in the model and/or data but not in parameters. Typically, such models are of size ≤ kN, where N is the number of observations, since the number of their constraints may vary in the LHS (e.g., level, derivatives, etc.).

Example of a cOLS problem:
Estimate the trend and the cyclical component of a country's GDP given the textbook or any other definition of its peaks, troughs, and saddles. For a pre-LPLS approach to this problem, see (Bolotov, 2014).

Transaction Matrix (TM) of size (M x N) is a formal model of interaction (allocation, assignment, etc.) between M and N elements in any imaginable system, such as intercompany transactions (netting tables), industries within/between economies (input-output tables), cross-border trade/investment (trade/investment matrices), etc., where row and column sums are known, but individual elements of the TM may not be:

- a netting table is a type of TM where M = N and the elements are subsidiaries of a MNC;
- an input-output table (IOT) is a type of TM where M = N and the elements are industries;
- a matrix of trade/investment is a type of TM where M = N and the elements are countries or (macro)regions, where diagonal elements may be equal to zero;
- a country-product matrix is a type of TM where M ≠ N and the elements are of different types;
...

Example of a TM problem:
Estimate the matrix of trade/investment with/without zero diagonal elements, the country shares in which are unknown. For a pre-LPLS approach to this problem, see (Bolotov, 2015).

Methods and formulas:
The LP problem in the LPLS estimator is a matrix equation a @ x = b, loosely based on the structure of the Simplex tableau, where a consists of coefficients for CONSTRAINTS, LP-type CHARACTERISTIC and/or SPECIFIC, and for SLACK/SURPLUS VARIABLES (the upper part) as well as for the MODEL (the lower part), as illustrated in Figure 1. Each part of a can be omitted to accommodate a particular case:

- cOLS problems require SPECIFIC CONSTRAINTS, no LP-type CHARACTERISTIC CONSTRAINTS, and a MODEL;
- TM requires LP-type CHARACTERISTIC CONSTRAINTS, no SPECIFIC CONSTRAINTS, and an optional MODEL;
- SLACK/SURPLUS VARIABLES are included only for inequality constraints and should be set to 1 or -1;
...

Figure 1: Matrix equation a @ x = b

CONSTRAINTS: CHARACTERISTIC/SPECIFIC SL/SU VARIABLES CONSTRAINTS

MODEL MODEL

Source: self-prepared

The solution to the equation, x = pinv(a) @ b, is estimated with the help of SVD and is a minimum-norm least-squares generalized solution if the rank of a is not full. To check if a is within the computational limits, its (maximum) dimensions can be calculated using the formulas:

- (k \* N) x (K + K\*) - cOLS - without slack/surplus variables;
- (k \* N) x (K + K\* + l) - cOLS - with slack/surplus variables;
- (M + N) x (M \* N) - TM - without slack/surplus variables;
- (M + N) x (M \* N + l) - TM - with slack/surplus variables;
- M x N - custom - without slack/surplus variables;
- M x (N + l) - custom - with slack/surplus variables;

where, in cOLS problems, K is the number of independent variables in the model (including the constant), K\* is the number of eventual extra variables in CONSTRAINTS, and N is the number of observations; in TM, M and N are the dimensions of the matrix; and in custom cases, M and N or M x (N + l) are the dimensions of a.

Example:
Task: Estimate the matrix of trade matrix with zero diagonal elements, the country shares in which are unknown:
We will use the multistep program tmpinvi for this purpose:

Code:

. clear . * Let's make a dataset using Mata code and print it . set obs 10 . gen str13 country_id = "" . mata: st_local("b", "Country #"); st_local("e", "Rest of World") . mata: st_sstore(., "country_id", ("`b'":+strofreal(1::9)\"`e'")) . gen float exports = runiform(0, 1000) . gen float imports = runiform(0, 1000) . list . * OLS estimation (possible for TM because of their estimation method): . tmpinvi exports imports, zerod adj(ave) l(0) u(1000) . matlist r(solution) . return list . * Minimum-norm least-squares generalized solution: . tmpinvi exports imports, zerod adj(ave) subm(10) l(0) u(1000) . matlist r(solution) . return list

NB Please consult the tmpinvi help file for the specifics of the TM problems (they have a "characteristic matrix", estimation via contiguous submatrices, pre-compiled Monte Carlo-based t-test samples, etc.).

References:
Albert, A., 1972. Regression And The Moore-Penrose Pseudoinverse. New York: Academic Press.

Bolotov, I. 2014. Modeling of Time Series Cyclical Component on a Defined Set of Stationary Points and its Application on the US Business Cycle. [Paper presentation]. The 8th International Days of Statistics and Economics: Prague. https://msed.vse.cz/msed_2014/articl...Ilya-paper.pdf

Bolotov, I. 2015. Modeling Bilateral Flows in Economics by Means of Exact Mathematical Methods. [Paper presentation]. The 9th International Days of Statistics and Economics: Prague. https://msed.vse.cz/msed_2015/articl...Ilya-paper.pdf

PS Please also check the Web of Science (WoS) for new research on LPLS. I will soon have some research papers published.
Tags: None

2 likes
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#2

22 Feb 2024, 07:16

There is a Python version of the lppinv program with the exactly same functionality (downloadable from PyPi):
https://pypi.org/project/lppinv/
It uses scipy and statsmodels for the SVD-based estimation, t-tests, and OLS.

Code:

import numpy as np import lppinv as lp # cOLS problem print('\n', lp.solve( cols=True, rhs=np.array([[0, 0, 0, 0, -1, 0.2, 0.9, 2.1]]).T, constraints=np.array([np.ones(4), [0, 1, 2, 3], [0, 5, 2, 8]]).T, model=np.array([np.ones(4), [0, 1, 2, 3], [0, 5, 2, 8]]).T, slackvars=np.array([[-1, 1, -1, 0]]).T, mc=False )) # TM problem def rnormal(r=None, c=None): return (np.random.normal(loc=0.0, scale=1.0, size=(r, c))) print('\n', lp.solve( tm=True, rhs=np.array([[4, 5, 3, 4, 6], [1, 2, 0, np.nan, np.nan]]).T, zero_diagonal=True, distribution=rnormal )) # custom problem print('\n', lp.solve( rhs=np.array([[2, 3, 9], [5, 7, 9]]).T, model=np.vstack([[0, 1, 1], [1, 0, 1], [1, 1, 0]]), tolerance=10**-10, trace=False ))

lp.solve() returns:

class lp.LPpinvResult(
OLSResults : statsmodels.regression.linear_model.OLSResults or None, statsmodels' results class for for an OLS model

TtestResult : (scipy.stats._result_classes.TtestResult, scipy.stats._result_classes.TtestResult, scipy.stats._result_classes.TtestResult) or(None, None, None), scipy's result of a t-test

solution : array_like, the solution obtained from the LPLS estimator

a : array_like, the a matrix

nrmse : float, root mean square error normalized by the standard deviation of b

r2_c : float or np.nan, R-squared for CONSTRAINTS in TM

)

Last edited by Ilya Bolotov; 22 Feb 2024, 07:21.
Comment

CONSTRAINTS: CHARACTERISTIC/SPECIFIC	SL/SU VARIABLES	CONSTRAINTS
MODEL		MODEL

Announcement

The LPLS (linear problems through least squares) estimator in lppinv, tmpinv, and tmpinvi commands

Comment