The method of multiple linear regression (MLR) is based on the
assumption that every measured parameter $P_0$ can be expressed as a
linear combination of other parameters $P_1$, $P_2$, ..., $P_n$
in the form:
$$P_0 = a_0 + a_1 P_1 + a_2 P_2 + \dots + a_n P_n$$
Using MLR the unknown $a$'s in such a formula can be calculated from at
least $n+1$ different sets of measurements of $P_0$ to $P_n$, one set
for each unknown coefficient.
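As an illustration, the fit can be sketched in a few lines of Python with NumPy. This is only a minimal sketch, assuming the linear form includes a constant term $a_0$; the function names `fit_mlr` and `predict_mlr` and all numbers are hypothetical and chosen for the example.

```python
import numpy as np

def fit_mlr(predictors, target):
    """Least-squares estimate of a_0..a_n in P_0 = a_0 + a_1*P_1 + ... + a_n*P_n."""
    predictors = np.asarray(predictors, dtype=float)
    target = np.asarray(target, dtype=float)
    # A column of ones carries the constant term a_0.
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

def predict_mlr(coeffs, predictors):
    """Evaluate the fitted formula for one or more sets of P_1..P_n."""
    predictors = np.atleast_2d(np.asarray(predictors, dtype=float))
    return coeffs[0] + predictors @ coeffs[1:]

# Synthetic, noise-free measurements (made-up numbers): 6 sets of (P_1, P_2).
rng = np.random.default_rng(0)
true_coeffs = np.array([2.0, 0.5, -1.5])          # a_0, a_1, a_2
P = rng.uniform(0.0, 10.0, size=(6, 2))
P0 = true_coeffs[0] + P @ true_coeffs[1:]

coeffs = fit_mlr(P, P0)                           # recovers a_0, a_1, a_2
estimate = predict_mlr(coeffs, [1.0, 2.0])        # P_0 for a new measurement set
```

With more measurement sets than unknown coefficients, the least-squares solution averages over measurement noise instead of fitting it exactly.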
This assumption can easily be related to the mixing of water masses.
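Let us take 3 water masses, 1, 2 and 3, each with its
characteristic parameter values
$P_0^{(1)}$, $P_0^{(2)}$, $P_0^{(3)}$ for parameter $P_0$;
$P_1^{(1)}$, $P_1^{(2)}$, $P_1^{(3)}$ for parameter $P_1$; etc.
The result of mixing these water masses is expressed in the following form:
$$\begin{aligned}
P_0 &= x_1 P_0^{(1)} + x_2 P_0^{(2)} + x_3 P_0^{(3)} \\
P_1 &= x_1 P_1^{(1)} + x_2 P_1^{(2)} + x_3 P_1^{(3)} \\
P_2 &= x_1 P_2^{(1)} + x_2 P_2^{(2)} + x_3 P_2^{(3)} \\
1 &= x_1 + x_2 + x_3
\end{aligned}$$
with $x_1$, $x_2$, $x_3$ being the percentages (expressed as fractions)
of water masses 1, 2 and 3, respectively, in the mixture, and the last
equation expressing mass conservation. If the water mass percentages and
the parameter $P_0$ are not known, we can calculate $P_0$ using the
last 3 formulas, arriving at:
$$P_0 = a_0 + a_1 P_1 + a_2 P_2$$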
where $a_0$, $a_1$ and $a_2$ are lengthy formulas involving only the
characteristic values $P_j^{(i)}$ of the water masses. If the
$P_j^{(i)}$'s are also not known, it is possible to
use other measured bottle data (bottle 1 with measured parameters
$P_{0,1}$, $P_{1,1}$, $P_{2,1}$; bottle 2 with
$P_{0,2}$, $P_{1,2}$, $P_{2,2}$; etc.) to determine the unknown $a$'s. A
minimum of 3 different bottles is then necessary to calculate
$P_0$ (the unknown from bottle 0), and a minimum of 9 would
be necessary to calculate all the $P_j^{(i)}$'s. If the mixing involves
more water masses, we always need at least as many
parameters M and bottles B as there are different water masses.
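The three-water-mass case can be checked numerically. The following is a minimal sketch with hypothetical characteristic values (none of the numbers come from real data): it first computes the coefficients of $P_0 = a_0 + a_1 P_1 + a_2 P_2$ directly from the water mass values, then shows that the same coefficients follow from 3 bottles alone, without knowing the water masses.

```python
import numpy as np

# Hypothetical characteristic values of water masses 1, 2 and 3.
end_P0 = np.array([2.0, 15.0, 30.0])    # parameter to be predicted
end_P1 = np.array([18.0, 4.0, 0.5])     # e.g. potential temperature
end_P2 = np.array([36.0, 35.0, 34.0])   # e.g. salinity

# Rows: mixing equation for P_1, for P_2, and mass conservation.
M = np.vstack([end_P1, end_P2, np.ones(3)])

# M @ x = (P_1, P_2, 1) and P_0 = end_P0 @ x, hence
# P_0 = (M^-T end_P0) . (P_1, P_2, 1) = a_1*P_1 + a_2*P_2 + a_0.
a1, a2, a0 = np.linalg.solve(M.T, end_P0)

def mix(x):
    """Return (P_0, P_1, P_2) for mixing fractions x that sum to 1."""
    x = np.asarray(x, dtype=float)
    return np.array([end_P0 @ x, end_P1 @ x, end_P2 @ x])

# Three bottles with different (in practice unknown) mixing fractions.
fractions = np.array([[0.2, 0.5, 0.3],
                      [0.6, 0.1, 0.3],
                      [0.1, 0.3, 0.6]])
bottles = np.array([mix(x) for x in fractions])   # columns: P_0, P_1, P_2

# Exactly 3 bottles and 3 unknown coefficients: solve the linear system.
design = np.column_stack([np.ones(3), bottles[:, 1], bottles[:, 2]])
fitted = np.linalg.solve(design, bottles[:, 0])   # a_0, a_1, a_2

# Bottle 0: P_1 and P_2 measured, P_0 to be calculated.
p0_true, p1, p2 = mix([0.3, 0.3, 0.4])
p0_pred = fitted[0] + fitted[1] * p1 + fitted[2] * p2
```

The bottles must not all lie on one mixing line between two water masses; otherwise the system is singular and the coefficients are not determined.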
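If the parameter $P_0$ is not only passively mixed but also has
sources and sinks which are linearly related to a parameter $P_3$
(as the nitrate source is linearly related to the apparent oxygen
utilization by the Redfield ratio), we can just add this term, with a
factor $k$, to the formula for $P_0$:
$$P_0 = x_1 P_0^{(1)} + x_2 P_0^{(2)} + x_3 P_0^{(3)} + k P_3$$
(with $x_i$ the water mass percentages and $P_0^{(i)}$ the
characteristic values of $P_0$) and
use the same MLR to calculate $P_0$.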
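The effect of such a source term can be sketched with synthetic data. The example below assumes a nitrate-like parameter made of a mixing part plus a term proportional to AOU; the characteristic values are hypothetical, and the factor 16/138 is used only as an illustrative Redfield-type nitrate to oxygen ratio. The regression simply gains one more predictor, and its coefficient is the factor $k$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical characteristic values of three water masses.
end_nitrate = np.array([2.0, 12.0, 20.0])   # preformed nitrate
end_theta = np.array([18.0, 4.0, 0.5])      # potential temperature
end_sal = np.array([36.0, 35.0, 34.0])      # salinity

k_true = 16.0 / 138.0                       # illustrative N:O2 ratio
n_bottles = 10

# Random mixing fractions (each row sums to 1) and independent AOU values.
x = rng.dirichlet(np.ones(3), size=n_bottles)
aou = rng.uniform(0.0, 150.0, size=n_bottles)

theta = x @ end_theta
sal = x @ end_sal
nitrate = x @ end_nitrate + k_true * aou    # mixing plus source term

# Same MLR as before, with AOU added as a further predictor.
design = np.column_stack([np.ones(n_bottles), theta, sal, aou])
coeffs, *_ = np.linalg.lstsq(design, nitrate, rcond=None)
k_est = coeffs[3]                           # regression estimate of k
```

Each added predictor adds one unknown coefficient, so at least 4 bottles are needed in this sketch.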
Other processes that can affect $P_0$ are not linearly correlated
with any parameter $P_i$, for example radioactive
decay or solubility, or the Redfield ratio in the last formula
is not constant, which can result in a locally varying factor $k$. But
if only small deviations from a fixed reference are considered, we can
linearize these processes within a small error margin and once again
use the MLR. One way to consider only small deviations is to use
only bottles in the proximity of bottle 0, where $P_0$ is to be
calculated. A positive side effect of this more local approach is
that fewer water masses have to be considered in the calculation.
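The benefit of the local approach can be sketched with a deliberately nonlinear synthetic relation. In the example below, proximity is measured by depth, which is an assumption of this sketch, and the helper names `fit_plane` and `local_mlr` are hypothetical. The regression uses only the bottles nearest to bottle 0 and is compared with a fit over all bottles.

```python
import numpy as np

def fit_plane(predictors, target):
    """Least-squares coefficients a_0..a_n for target = a_0 + sum(a_i * P_i)."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

def local_mlr(positions, predictors, target, pos0, pred0, n_near):
    """Estimate the target at bottle 0 from its n_near closest bottles."""
    nearest = np.argsort(np.abs(positions - pos0))[:n_near]
    coeffs = fit_plane(predictors[nearest], target[nearest])
    return coeffs[0] + np.dot(coeffs[1:], pred0)

# Synthetic profile: P_0 depends quadratically (not linearly) on P_1.
depth = np.arange(0.0, 1001.0, 25.0)
P1 = 20.0 * np.exp(-depth / 300.0)
P0 = 0.05 * P1**2
predictors = P1[:, None]                    # one predictor per bottle

# Bottle 0 at 510 m: P_1 is measured, P_0 is to be calculated.
depth0 = 510.0
pred0 = np.array([20.0 * np.exp(-depth0 / 300.0)])
true0 = 0.05 * pred0[0]**2

local_est = local_mlr(depth, predictors, P0, depth0, pred0, n_near=6)
global_est = local_mlr(depth, predictors, P0, depth0, pred0, n_near=len(depth))
local_err = abs(local_est - true0)
global_err = abs(global_est - true0)
```

In this sketch the local estimate is closer to the true value than the global one, because the quadratic relation is nearly linear over the narrow range spanned by the nearest bottles. The number of neighbours must still be at least the number of unknown coefficients.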
The following is an incomplete list of parameter
correlations, not necessarily linear ones, which are not due to mixing
or to the equation of state.
- Nitrate (NO$_3$) - Phosphate (PO$_4$) - dissolved inorganic
carbon (T-C) - apparent
oxygen utilization (AOU), due to the constant Redfield ratio and the
processes of formation and remineralization of organic material
- potential temperature ($\theta$) with gases like CO$_2$,
oxygen, etc., due to the temperature dependence of the solubility
- in the Atlantic, silica (SiO$_2$) correlates with the percentage
of water of southern (or Pacific) origin
- a Pacific origin of water masses is in turn correlated with
higher alkalinity and terrigenic helium
- apparent oxygen utilization is correlated with age if we can
assume a constant respiration rate
- age in turn correlates with anthropogenic tracers like tritium,
freon or CCl$_4$
Juergen Holfort
2004-08-12