Least Squares Optimization
leastSquares: Least Squares
Subpackage which provide some methods for linear and nonlinear least squares problems, e.g:
and
for an objective function \(f\) having the special form \(f(x)=\frac{1}{2}\sum\limits_{j=1}^m r_j^2(x)=\frac{1}{2}\|r(x)\|_2^2\).
Right now we can solve this kind of problems with the following methods.
-
- Linear Least Squares
-
- linear least squares (solving normal equation)
-
- Nonlinear Least Squares
-
- Gauss-Newton algorithm with several choices.
- Levenberg-Marquardt with line-search or trust-region algorithm.
Documentation is available in the docstrings and online here.
Methods
oppy.leastSquares.linear_least_square module
Linear Least Squares Algorithm.
This Module contains the function linear_least_square() and all
helper functions to solve the problem of finding
a solution of the linear least squares problem, e.g. solve:
-
oppy.leastSquares.linear_least_square.linear_least_square(A, b, options=None) -
Linear Least Squares.
This function solves the normal equation, e.g. a linear optimization problem of the form
\[\Vert Ax' - b \Vert_2 = \min \Vert Ax - b \Vert_2\]by using either Cholesky decomposition, QR decomposition or singular value decomposition. Matrices in sparse formate are not supported. In case you need a least squares function for sparse matrices anyway
scipy.sparse.linalg.lsqrmay be the desired function.
Parameters: -
A (
numpy.ndarray, shape (m,n)) – Matrix A of the least squares problem. -
b (
numpy.ndarray, shape (m,)) – RHS b of the least squares problem. -
options (
Options, optional) –Options for simplex, represented as a
Optionsobject. The default isNone. Possible options are:-
- disp
bool, optional - If
dispis True the method will show some results during the process. The default is False.
- disp
-
- fargs
str, optional - Fargs containg information about decomposition method, if
fargs=None then the QR decomposition is performed
'chol'Cholesky decomposition'QR'QR decomposition'svd'sigular value decomposition. The default is None.
- fargs
-
Returns: x – Solution of the least squares problem.
Return type: numpy.ndarray, shape (n,)Examples
In this example, a problem with a large dense matrix is solved.
>>> import numpy as np >>> from oppy.leastSquares import linear_least_square >>> m = 500 >>> n = 50 >>> A = np.random.rand(m, n) >>> b = np.random.rand(m) >>> x = linear_least_square(A, b) >>> print(np.linalg.norm((A.dot(x) - b))) # may vary >>> 6.143064057239379
References
W. Dahmen, A. Reusken. Numerik für Ingenieure und Naturwissenschaftler, see Dahmen and Reusken [DR06].
- Luik. “Numerik 1”. Lecture Notes, see Luik [Lui10].
-
A (
-
oppy.leastSquares.linear_least_square.bidiagonal_decomposition(A) -
Determine the bidiagonal decomposition of a matrix.
This function determines the bidiagonal decomposition of A s.t.
\[A = QL * B * QR.\]Parameters: A ( numpy.ndarray, shape (m,n)) – matrix A.Returns: res – Contains [QL, A, QR] in a list. Return type: list
-
oppy.leastSquares.linear_least_square.singular_value_decomposition(A) -
Calculate the singular value decomposition of a matrix A.
This function calculates the singular value decomposition of a matrix A such that
\[U^TAV = \Sigma,\]where U and V are orthogonal matrices and Sigma is a diagonal matrix containing the singular values of A.
Parameters: A ( numpy.ndarray, shape (m,n)) – matrix A.Returns: -
V (
numpy.ndarray, shape (n,n)) – matrix V of SVD -
sigma (
numpy.ndarray, shape (m,n)) – singular values -
U (
numpy.ndarray, shape (m,m)) – matrix U of SVD
-
V (
-
oppy.leastSquares.linear_least_square.singular_value_solution(A, b) -
Solve lsq problem using SVD.
This function solves the linear problem
\[||Ax' - b|| = \min ||Ax - b||\]by using the singular value decomposition similar to chapter 4.7 in in [1].
Parameters: -
A (
numpy.ndarray, shape (m,n)) – matrix of least squares problem. -
b (
numpy.ndarray, shape (m,)) – Rhs of least squares problem.
Returns: x – Solution of the least squares problem.
Return type: numpy.ndarray, shape (n,)References
W. Dahmen, A. Reusken. Numerik für Ingenieure und Naturwissenschaftler, see Dahmen and Reusken [DR06].
-
A (
-
oppy.leastSquares.linear_least_square.signum(x) -
Change the signum command.
This function changes the sign-command s.t. sign(0) = 1.
Parameters: x ( numpy.ndarray, shape (n,)) – Input vector x.Returns: sigma – returns the sign(x). Return type: numpy.ndarray, shape (n,)
-
oppy.leastSquares.linear_least_square.sort(x, A) -
Sort entries of a vector in decreasing order.
This function sorts the entries of a vector x by decreasing order and changes the columns of a further matrix in corresponding order.
Parameters: -
x (
numpy.ndarray, shape (n,)) – input vector x. -
A (
numpy.ndarray, shape (m,n)) – input matrix A.
Returns: -
y (
numpy.ndarray, shape (n,)) – sorted y. -
B (
numpy.ndarray, shape (m,n)) – sorted B.
-
x (
oppy.leastSquares.nonlinear_least_square
Nonlinear Least-Squares Algorithm.
This module contains the function nonlinear_least_square() and all
helper functions to solve the minimization problem:
for an objective function \(f\) having the special form \(f(x)=\frac{1}{2}\sum\limits_{j=1}^m r_j^2(x)=\frac{1}{2}\|r(x)\|_2^2\).
The residual vector \(r(x)=(r_1(x),\ldots,r_m(x))^T\) contains \(m\) smooth functions \(r_j:\mathbb{R}^n\rightarrow\mathbb{R}\). Furthermore it is assumed that \(m \geq n\) is satisfied.
-
oppy.leastSquares.nonlinear_least_square.nonlinear_least_square(fhandle, dfhandle, x, options=None) -
Nonlinear Least-Squares Algorithm.
Parameters: -
fhandle (
callable()) –The residual vector r. The objective function to be minimized is built of:
fhandle(x, *fargs) -> array_like, shape (m,).where \(x\) is an 1-D array with shape (n,) and
argsis a tuple of the fixed parameters needed to completely specify the function. -
dfhandle (
callable()) –The Jacobian matrix of the residual vector r.
dfhandle(x, *fargs) -> array_like, shape (m, n).where x is an 1-D array with shape (n,) and
argsis a tuple of the fixed parameters needed to completely specify the function. -
x (
numpy.ndarray, shape (n,)) – Initial guess. Array of real elements of size (n,), where ‘n’ is the number of independent variables. -
options (
Options, optional) –Options for the nonlinear least-squares algorithm, represented as an
Optionsobject. The default isNone. Possible options are:-
- method
str - It can be chosen which method is used to solve the nonlinear
least-squares problem. The following line search based methods are
available:
gauss_newtonandlevenberg_marquardt. The first one is the Gauss-Newton method and the latter one a Levenberg-Marquardt variant of the Gauss-Newton method. The inputlevenberg_marquardt_trust_regionapplies a Levenberg-Marquardt method within a trust-region approach. The default islevenberg_marquardt_trust_region.
- method
-
- tol_abs
float - Absolute tolerance for the termination condition, initialized with \(10^{-7}\).
- tol_abs
-
- tol_rel
float - Relative tolerance for the termination condition, initialized with \(10^{-7}\).
- tol_rel
Note
The optimization algorithm stops, if the norm of the current residual
res[k]satisfies the inequality\[\mathrm{res}[k] < \mathrm{stopping\_criteria},\]where
stopping_criteriais given by\[\mathrm{stopping\_criteria} = \mathrm{tol\_rel} \cdot \mathrm{res}[0] + \mathrm{tol\_abs},\]and
res[0]is the norm of the first residual. If the value ofstopping_criteriais bigger than \(0.1\), e.g. if the norm of the first residual is big and the relative tolerance is small, then this value is set to \(0.1\).-
- fargs
tuple - Tuple containing additional arguments for fhandle.
- fargs
-
- linesearch
callable(), - Callable function that computes a steplength. The default is
Noneand therefore initialized towolfe().
- linesearch
-
- disp
bool - If
disp == True, some details about the progress will be shown during the optimization
- disp
-
- norm
callable() - Callable function that computes the norm of the
gradient \(\nabla f\) in a vector \(x\). The default is
Noneand therefore initialized to the standard norm in \(\mathbb{R}^n\), i.e.\[\mathrm{norm}: x \mapsto ||x||_2\]Note
If you want to change the norm, please remember, that you have to change the
inner_prodfunction in your linesearch algorithm, such thatnormis computed by\[\mathrm{norm} : x \mapsto \sqrt{\mathrm{inner\_prod}(x,x)}.\]
- norm
-
- solve_linear
str - Algorithm that is choosen for solving the linear system\[J^T(x_k)J(x_k)p = -J(x_k)^Tr(x_k) : x \mapsto \sqrt{\mathrm{inner\_prod}(x,x)}.\]
to get the search direction \(p_k\). It is initialized with
qr(QR-factorization) (alsochol(Cholesky factorization) andsvd(Single Value Decomposition) are available).
- solve_linear
-
- delta
float - Initial trust-region radius. The default is 100.
- delta
-
- delta_max
float - Upper bound for the trust-region radius. The default is \(10^{12}\).
- delta_max
-
- eta
float - Threshold which decides whether the step is taken. It has to be chosen in \((0,1)\). The default is 0.0001.
- eta
-
- sigma
float - Possible deviation from the trust-region constraint. The default is 0.1.
- sigma
-
- scaling
bool - If set to
Truethen the scaling of the problem is incorporated. The default isFalse.
- scaling
-
Returns: results – Instance of the class
Results. It represents the optimization results. Attributes of the object depend on the choice of theoptions.resultsparameter. With default choice the object has the following attributes:-
- x
numpy.ndarray, shape (n,) -
Solution vector of the optimization problem.
- x
-
- iterations
int -
Number of iterations the algorithm needed.
- iterations
-
- res
list -
List with the norm of the residuals.
- res
Return type: Examples
Minimizing the Rosenbrock function.
>>> from oppy.tests.costfunctions import rosenbrock_least_square,\ ... gradient_rosenbrock_least_square >>> import numpy as np
The
nonlinear_least_square()function is chosen to minimize the function.>>> from oppy.leastSquares.nonlinear_least_square import \ ... nonlinear_least_square
Additionaly, a starting value has to be chosen.
>>> x0 = np.array([6,3])
Now we can execute the
nonlinear_least_square()function:>>> result = nonlinear_least_square(rosenbrock_least_square,\ ... gradient_rosenbrock_least_square, x0)) >>> print(result) iterations: 2 res: array([7.94844897e+04, 1.11803399e+04, 2.24792062e-13]) x: array([1., 1.])
References
J. Nocedal, S.J. Wright. Numerical Optimization, see Nocedal and Wright [NW06] p. 245-257.
Jorge J. Moré. “The Levenberg-Marquardt algorithm: Implementation and theory”. see Moré [Mor78].
-
fhandle (
-
oppy.leastSquares.nonlinear_least_square.givens_qr(R_I, n, m) -
Calculate a QR decomposition of a special matrix.
This function calculates the \(QR\) decomposition of a given \((m+n, n)\) matrix, which is already upper triangular except for \(n\) elements. The funtion is used in the implementation of the Levenberg-Marquardt method as a trust-region method. More detailed, the input matrix looks like:
\[\begin{split}\left[ \begin{array}{c} R \\ 0 \\ \sqrt{\lambda}I \end{array} \right]\end{split}\]where \(R\) is an upper triangular matrix. The algorithm performs \((n(n+1))/2\) Givens rotations to get its QR decomposition.
Parameters: -
R_I (
numpy.ndarray, shape (m+n, n)) – The input matrix R_I is already upper triangular, except for the elements \((m+1, 1), (m+2, 2), ..., (m+n, n)\). -
n (
int) – Number of columns of the input matrix. -
m (
int) – Integer such that the number of rows of R_I is \(m+n\).
Returns: -
R_I (
numpy.ndarray, shape (m+n, n)) – Rotated matrix R_I, which is now an upper triangular matrix. -
Q.T (
numpy.ndarray, shape (m+n, m+n)) – This is the transpose of the product of all rotations, which were used to rotate the input matrix R_I.
-
R_I (
-
oppy.leastSquares.nonlinear_least_square.givens_rotation(A, i, j) -
Calculate one Givens rotation.
This function calculates one Givens rotation, such that a given matrix A is rotated. As a consequence, the entry A[i, j] is then zero.
Parameters: -
A (
numpy.ndarray, shape (m+n, n)) – Input matrix which is rotated. -
i (
int) – Row number of the entry which will be zero. -
j (
int) – Column number of the entry which is after the rotation zero.
Returns: -
B (
numpy.ndarray, shape (m+n, n)) – Givens rotated matrix B. -
Q (
numpy.ndarray, shape (m+n, m+n)) – Givens rotation matrix, it is \(Q \cdot A = B\).
-
A (