BACKGROUND TO LEAST SQUARES ADJUSTMENTS
There are many textbooks presenting the principles of Least Squares estimation
as they apply to surveying and geodesy. Two that are particularly useful
at an introductory level are CROSS
(1983) and HARVEY (1994).
The following is a brief summary of the basic methodology.
A Least Squares adjustment involves two models:
The functional model relating the measurements and the parameters. The most common approach is to use observation equations of the general form l = f(x). To satisfy this relation, actual observations need to be corrected or "adjusted". The linearisation of the relation is performed about an approximate set of values x_0 for the parameters to be estimated:
l - v = f(x_0) + A δx,  i.e.  (l - f(x_0)) = A δx + v | (7.1-1) |
The expression in brackets is the "observed minus computed" term, or approximate residual, and is denoted by v_0. l̄ and l are the true and actual observations respectively, x and x_0 are the true and approximate (or apriori) parameters respectively, δx is the vector of corrections to the approximate parameters, and A is the design matrix containing the partial derivatives of the observations with respect to the parameters.
The stochastic model
describing the statistics of the measurements. This is in the form of the
weight matrix P, or its inverse the covariance matrix of the observations.
For example, all measurements could be independent (that is, a diagonal
weight matrix) and have the same standard deviation.
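This simplest case of the stochastic model can be sketched in a few lines of Python/NumPy (the standard deviation of 2 mm and the number of measurements below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical stochastic model: three independent measurements,
# each with a standard deviation of 2 mm (0.002 m).
sigma = 0.002
Q_ll = np.eye(3) * sigma**2      # diagonal covariance matrix of the observations
P = np.linalg.inv(Q_ll)          # weight matrix: inverse of the covariance matrix

# For independent, equal-precision observations the weight matrix is
# simply (1/sigma^2) times the identity matrix.
```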
Condition Method
Condition equations express properties that the observations should satisfy. The general form of a condition equation is F(l̄) = 0, where l̄ is the vector of true observations. Actual observations l are generally biased by a number of errors and therefore do not satisfy this condition. A vector of miscloses can be computed as F(l) = w. The adjustment aims at computing the vector of corrections v to the observations such that the corrected observations satisfy both the relation F(l - v) = 0 and the Least Squares condition v^{T}Pv ---> minimum. The linearisation of the condition equation is based on a first-order Taylor series expansion:
F(l - v) = F(l) - Bv | (7.1-2) |
where B is the design matrix, containing the partial derivatives of F(l) with respect to l, evaluated at the actual observations l. The variance-covariance (VCV) matrix Q_{l l} of the observations is assumed known. The computational procedure can be summarised in a few steps:
Linearised form: Bv = w | (7.1-3) |
Solution for the residuals: v = Q_{l l} B^{T} (B Q_{l l} B^{T})^{-1} w | (7.1-4) |
VCV matrix of the residuals: Q_{vv} = Q_{l l} B^{T} (B Q_{l l} B^{T})^{-1} B Q_{l l} | (7.1-5) |
VCV of the adjusted observations l - v: Q_{l l} - Q_{vv} | (7.1-6) |
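These steps can be exercised on a small numerical sketch in Python/NumPy. The example below, a hypothetical plane triangle whose three measured angles must sum to 180 degrees (a single condition, r = 1), is not from these notes; the angle values and their precision are invented for illustration:

```python
import numpy as np

# Hypothetical example: the three measured angles of a plane triangle
# must sum to 180 degrees -- a single condition equation (r = 1).
l = np.array([60.0012, 59.9985, 60.0015])   # actual observations (degrees)
Q_ll = np.eye(3) * 0.0010**2                # independent, equal-precision angles

B = np.array([[1.0, 1.0, 1.0]])             # partial derivatives of F(l) w.r.t. l
w = np.array([l.sum() - 180.0])             # misclose vector: F(l) = w

# Solution for the residuals, eqn (7.1-4); the matrix to invert is r x r (1 x 1):
M = B @ Q_ll @ B.T
v = Q_ll @ B.T @ np.linalg.inv(M) @ w

# VCV of the residuals, eqn (7.1-5), and of the adjusted observations, eqn (7.1-6):
Q_vv = Q_ll @ B.T @ np.linalg.inv(M) @ B @ Q_ll
Q_adj = Q_ll - Q_vv

l_adj = l - v        # the corrected observations satisfy F(l - v) = 0
```

For equal weights the misclose is simply distributed evenly over the three angles, so each correction is w/3.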
The advantage of the condition method is that the unknown terms are simply
corrections to the observations. The size of the matrix B Q_{l
l} B^{T} to invert is r x r, where r is the number
of conditions, which is in fact equal to the number of redundant observations.
However, there are a number of drawbacks. The derivation of adjusted values
for functions of the observables (for example, the coordinates) is tedious,
as is the derivation of their respective precisions and correlations. Furthermore
the construction of eqn (7.1-2) requires a sound geometrical understanding
of the situation, as only independent conditions must be used. Consequently,
the setting-up of the equations is not easily automated for a computer and
this method is not used for GPS adjustments. However this method was especially
popular in the past for geodetic network adjustments when computers were
not available.
Parametric Method
This method of adjustment makes use of observation equations, where observables are expressed as a function of some or all of the parameters, in the general form l = f(x). To satisfy this relation, actual observations need to be corrected or "adjusted". The linearisation of the relation is performed about an approximate set of parameters x_0:
l - v = f(x_0) + A δx,  i.e.  (l - f(x_0)) = A δx + v | (7.1-7) |
The expression in brackets is the approximate residual, and is denoted by v_0.
The variance-covariance matrix Q_{l l}
of the observations is assumed known. As v_0 differs from l only by a
constant, it has the same stochastic behaviour. The computational procedure
therefore is:
Linearised form: v_0 = A δx + v | (7.1-8) |
Solution for the parameters: δx = (A^{T} P A)^{-1} A^{T} P v_0 | (7.1-9) |
with VCV matrix: Q_{xx} = (A^{T} P A)^{-1} | (7.1-10) |
The adjusted observation residuals can be computed in two different ways:
v = l - f(x_0 + δx) | (7.1-11) |
v = v_0 - A δx | (7.1-12) |
The second method clearly illustrates the relation between the approximate and adjusted parameters. Indeed, this is the main justification for the choice of the unusual symbol v_0 to denote the vector of approximate residuals, ensuring complete consistency between quantities related either to x or v. The VCV matrix of the residuals is easily derived from the indirect computation of the residuals, assuming the stochastic independence of the measurements and the approximate parameter vector:
Q_{vv} = Q_{l l} - A Q_{xx} A^{T} | (7.1-13) |
The aposteriori variance factor is:
VF = v^{T} P v / (n - u)
where n is the number of observations and u is the number of parameters.
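The whole parametric procedure can be sketched numerically in Python/NumPy. The levelling network below (one fixed benchmark A at height 0, two unknown heights for B and C, and three observed height differences around the loop A->B->C->A, so n = 3 and u = 2) is a hypothetical example invented for illustration, not one from these notes:

```python
import numpy as np

# Hypothetical levelling example: benchmark A fixed at 0.0 m; the
# heights of B and C are the unknown parameters (u = 2, n = 3).
l = np.array([1.234, 0.506, -1.745])   # observed height differences (m)
x0 = np.array([1.2, 1.7])              # approximate heights of B and C
P = np.eye(3)                          # equal weights, independent observations

# Functional model l = f(x): f1 = H_B, f2 = H_C - H_B, f3 = -H_C
A = np.array([[ 1.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])           # design matrix (partials of f w.r.t. x)
v0 = l - A @ x0                        # "observed minus computed" approximate residuals

# Solution for the parameters, eqns (7.1-9) and (7.1-10):
N = A.T @ P @ A
dx = np.linalg.solve(N, A.T @ P @ v0)
Q_xx = np.linalg.inv(N)
x_adj = x0 + dx

# Residuals computed indirectly, eqn (7.1-12), and the variance factor:
v = v0 - A @ dx
n, u = A.shape
vf = (v @ P @ v) / (n - u)
```

With equal weights the loop misclose of -5 mm is distributed evenly over the three height differences, as expected.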
Although it is not strictly correct, in these notes no distinction will be made between co-factor matrices and variance-covariance matrices. The variance factor scales the cofactor matrix to give the VCV matrix. If the VF is close to unity (as it should be), then the co-factor matrix and the VCV matrix are almost identical.
If the desired results (for example, the coordinates) are selected as the
parameters, the solution of the system leads directly to the answer. There
is exactly one equation per observation, and its form can easily be defined
according to the type of observation. The size of the matrix to invert is
u x u, where u is the number of (unknown) parameters, and the setting-up
of the equations can therefore be automated in a computer program. The linearisation
of the problem requires some apriori approximate knowledge of the parameters,
which is usually available. For GPS adjustments, it is generally no problem
to obtain a converged solution of the non-linear problem (eqn (7.1-19))
through an iterative process.
Relations Between Both Approaches
Since any problem can be reduced to either the condition or the parametric case, there is actually a choice between the two methods of setting up and solving the Least Squares problem. Given n linear observation equations in u (unknown) parameters, the elimination of all the parameters leads to a system of r = n - u condition equations. The reverse process is much harder because there is an almost unlimited number of possible parameterisations. In some cases, for example when there are fewer redundant observations than unknown parameters, the condition method offers computational advantages. However, the dramatic improvement in the power of computers has made these advantages largely irrelevant, and the adjustment by parameters is now the standard solution approach for almost all over-determined systems.
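The equivalence of the two methods can be checked numerically. In the hypothetical sketch below, a levelling loop (n = 3 observed height differences, u = 2 unknown heights, hence r = 1 condition: the loop closure) is adjusted both ways and yields identical corrections to the observations:

```python
import numpy as np

# Hypothetical levelling loop A->B->C->A, adjusted by both methods.
l = np.array([1.234, 0.506, -1.745])   # observed height differences (m)
Q_ll = np.eye(3)                       # equal precision; P = Q_ll^{-1} = I

# Parametric method: u = 2 unknown heights (A held fixed); the model is
# linear, so the approximate parameters may be taken as zero.
A = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
dx = np.linalg.solve(A.T @ A, A.T @ l)
v_param = l - A @ dx

# Condition method: eliminating the parameters leaves r = n - u = 1
# condition, the loop closure l1 + l2 + l3 = 0.
B = np.array([[1.0, 1.0, 1.0]])
w = B @ l
v_cond = (Q_ll @ B.T @ np.linalg.inv(B @ Q_ll @ B.T) @ w).ravel()

# Both methods yield the same corrections to the observations.
same = np.allclose(v_param, v_cond)
```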
The Combined Case
It is also possible to formulate relations involving both observables and parameters. Functions of the observables are related to functions of the parameters, in the general form F(l, x) = 0. For example, this method is useful when solving for transformation parameters (HARVEY, 1994). A linear relation is obtained following the usual procedure:
F(l - v, x_0 + δx) = F(l, x_0) - Bv + Aδx = 0 | (7.1-14) |
Bv = w + Aδx,  where w = F(l, x_0) | (7.1-15) |
Solution for the parameters: δx = -(A^{T} (B Q_{l l} B^{T})^{-1} A)^{-1} A^{T} (B Q_{l l} B^{T})^{-1} w | (7.1-16) |
Variance-covariance matrix: Q_{xx} = (A^{T} (B Q_{l l} B^{T})^{-1} A)^{-1} | (7.1-17) |
Formulae for the residuals are given, for example, in CROSS (1983). To demonstrate that the combined case contains both previous approaches, it suffices to rearrange eqn (7.1-15) and consider particular design matrices:
Condition: | Bv = w + Aδx | with A = 0 |
Parametric: | w - Bv = Aδx | with B = I |
In the parametric case, the misclose vector w is equal to v_0, the observation residual computed using the approximate parameters x_0.
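A numerical sketch of the combined case (the data and linearisation point below are hypothetical): fitting the line y = a + b·x to points whose coordinates are BOTH observed leads to relations F_i = y_i - a - b·x_i = 0 that involve observations and parameters together, so neither the pure condition nor the pure parametric form applies directly. A single linearisation step then gives:

```python
import numpy as np

# Hypothetical combined-case example: fit y = a + b*x when both
# coordinates of each point are observed (8 observations, 2 parameters).
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([0.1, 1.1, 1.9, 3.1])
Q_ll = np.eye(8)                        # observation vector is (x's, y's)

a0, b0 = 0.0, 1.0                       # approximate parameters
w = ys - a0 - b0 * xs                   # misclose vector F(l, x0)

# Design matrices evaluated at l and x0:
A = np.column_stack([np.full(4, -1.0), -xs])            # partials of F w.r.t. (a, b)
B = np.hstack([np.diag(np.full(4, -b0)), np.eye(4)])    # partials of F w.r.t. l

# Solution for the parameters: dx = -(A^T M^-1 A)^-1 A^T M^-1 w, M = B Q B^T
M = B @ Q_ll @ B.T
Mi = np.linalg.inv(M)
dx = -np.linalg.solve(A.T @ Mi @ A, A.T @ Mi @ w)
a, b = a0 + dx[0], b0 + dx[1]

# Residuals follow from v = Q_ll B^T M^-1 (w + A dx):
v = Q_ll @ B.T @ Mi @ (w + A @ dx)
```

Because the partials of F with respect to the observed x-coordinates depend on b, the matrix B is not trivial here, which is precisely what distinguishes the combined case from the parametric one.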
Bayesian Least Squares
In many cases, a fairly good apriori knowledge of the parameters is available. Thus, it is reasonable to require that the adjusted value of a parameter should not be too different from its apriori value. This condition can be imposed in a number of ways:
The resultant normal equations are (MERMINOD & RIZOS, 1988):
δx = (A^{T} P A + P_{x_0})^{-1} A^{T} P v_0 | (7.1-18) |
The extension of the quadratic form to v^{T}Pv + δx^{T}P_{x_0}δx represents a generalisation of the classical Least Squares method, where P_{x_0} is the apriori weight matrix of the parameters. With P_{x_0} = 0, that is, no apriori information on the parameters is available, Bayesian Least Squares reduces to the classical definition.
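A short sketch (with hypothetical numbers) shows the effect of the apriori weight matrix on the normal equations: the apriori weights damp the corrections, keeping the adjusted parameters close to their apriori values, and with zero apriori weight the classical solution is recovered.

```python
import numpy as np

# Hypothetical small parametric problem, solved with and without
# apriori weights on the parameters.
A = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
P = np.eye(3)
v0 = np.array([0.034, 0.006, -0.045])   # approximate residuals (observed - computed)

# Classical solution: P_x0 = 0, no apriori information on the parameters.
dx_classical = np.linalg.solve(A.T @ P @ A, A.T @ P @ v0)

# Bayesian solution: P_x0 pulls the corrections towards zero, i.e.
# keeps the adjusted parameters near their apriori values.
P_x0 = np.eye(2) * 10.0
dx_bayes = np.linalg.solve(A.T @ P @ A + P_x0, A.T @ P @ v0)
```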
Sequential Least Squares
The batch and sequential (or step-by-step) processing modes can be distinguished. The batch processing mode is the one most commonly encountered in standard GPS static data processing as well as in most geodetic adjustment problems. The Least Squares adjustment is carried out once all the data has been acquired. However, a sequential treatment of Least Squares problems may be preferable for several reasons:
It is therefore tempting to treat successive batches of measurements sequentially, though it should be kept in mind that the results of a sequential Least Squares adjustment will be identical to that of a batch solution once all the data has been processed. The formulae used in sequential Least Squares estimation are given in, for example, CROSS (1983).
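One simple way to see this equality, sketched below with hypothetical random data, is to accumulate the normal equations batch by batch: the accumulated solution is identical to the one-shot batch solution. (The sequential formulae in CROSS (1983) instead update the solution and its covariance directly as each batch arrives, but they are algebraically equivalent to this accumulation.)

```python
import numpy as np

# Hypothetical data: 10 observations, 3 parameters, linear model.
rng = np.random.default_rng(0)
A = rng.normal(size=(10, 3))            # design matrix
l = rng.normal(size=10)                 # approximate residual vector (v_0)

# Batch solution: all observations processed at once.
dx_batch = np.linalg.solve(A.T @ A, A.T @ l)

# Sequential solution: two batches of 5 observations, accumulating
# the normal matrix N and right-hand side u.
N = np.zeros((3, 3))
u = np.zeros(3)
for start in (0, 5):
    Ai, li = A[start:start + 5], l[start:start + 5]
    N += Ai.T @ Ai
    u += Ai.T @ li
dx_seq = np.linalg.solve(N, u)

# Once all the data has been processed the two solutions coincide.
same = np.allclose(dx_batch, dx_seq)
```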
© Chris Rizos, SNAP-UNSW, 1999