Next: Data interpolation Up: Multiple linear regression as Previous: Introduction


The method of multiple linear regression (MLR) is based on the assumption that a measured parameter $M_0$ can be expressed as a linear combination of other parameters $M_1$, $M_2$, ..., $M_n$ in the form:

\begin{displaymath}M_0=a_0 + a_1 M_1 + a_2 M_2 +...+ a_n M_n \end{displaymath}

Using MLR, the unknown coefficients $a_0$ to $a_n$ in such a formula can be calculated from at least $n+1$ different sets of measurements of $M_0$ to $M_n$, since there are $n+1$ unknowns.
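As a minimal sketch of this fit, assuming NumPy is available: the function name `fit_mlr` and the synthetic data are illustrative assumptions, not part of the original method description. When more measurement sets than unknowns are supplied, the coefficients are determined in the least-squares sense.

```python
import numpy as np

def fit_mlr(M_pred, M0):
    """Least-squares fit of M0 = a0 + a1*M1 + ... + an*Mn.

    M_pred has shape (m, n): m measurement sets of M1..Mn.
    M0 has shape (m,). Returns the array (a0, a1, ..., an).
    """
    M_pred = np.asarray(M_pred, dtype=float)
    M0 = np.asarray(M0, dtype=float)
    # A leading column of ones carries the constant term a0.
    A = np.column_stack([np.ones(len(M0)), M_pred])
    coeffs, *_ = np.linalg.lstsq(A, M0, rcond=None)
    return coeffs

# Synthetic, noise-free example with n = 3 predictors (invented numbers).
rng = np.random.default_rng(0)
M_pred = rng.normal(size=(20, 3))
true_a = np.array([1.5, 2.0, -0.5, 0.25])
M0 = true_a[0] + M_pred @ true_a[1:]

a_hat = fit_mlr(M_pred, M0)
```

With noise-free data as here, `a_hat` reproduces `true_a` up to rounding; with real measurements it is the best linear fit in the least-squares sense.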

This assumption can easily be related to the mixing of water masses. Let us take 3 water masses $W_1$, $W_2$ and $W_3$, each with its characteristic parameter values: $W_{10}$, $W_{20}$, $W_{30}$ for parameter $M_0$; $W_{11}$, $W_{21}$, $W_{31}$ for parameter $M_1$; etc. The result of mixing these water masses is expressed in the following form:

\begin{displaymath}M_0= p_1*W_{10} + p_2*W_{20} + p_3*W_{30} \end{displaymath}

\begin{displaymath}M_1= p_1*W_{11} + p_2*W_{21} + p_3*W_{31} \end{displaymath}

\begin{displaymath}M_2= p_1*W_{12} + p_2*W_{22} + p_3*W_{32} \end{displaymath}

\begin{displaymath}1 = p_1 + p_2 + p_3 \end{displaymath}

with $p_1$, $p_2$, $p_3$ being the fractions of water masses 1, 2 and 3, respectively, in the mixture, and the last equation expressing mass conservation. If the water mass fractions $p_x$ and the parameter $M_0$ are not known, we can solve the last 3 equations for the $p_x$ and insert the result into the first equation, arriving at:

\begin{displaymath}M_0 = l_0 + l_1*M_1 + l_2*M_2 \end{displaymath}

where $l_0$, $l_1$ and $l_2$ are lengthy expressions involving only the $W_{xx}$'s. If the $W_{xx}$'s are also not known, it is possible to use other measured bottle data (bottle $B_1$ with measured parameters $M_{01}$, $M_{11}$, $M_{21}$; bottle $B_2$ with $M_{02}$, $M_{12}$, $M_{22}$; etc.) to determine the unknown l's. A minimum of 3 different bottles is then necessary to calculate $M_{00}$ (the unknown $M_0$ from bottle 0), and a minimum of 9 would be necessary to calculate all the $W_{xx}$'s. If the mixing involves more water masses, we always need at least as many parameters M and bottles B as there are different water masses.
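The two routes to the l's can be checked numerically. The sketch below assumes NumPy; all water mass values, mixing fractions and variable names are invented for illustration. It first obtains $l_0$, $l_1$, $l_2$ directly from known $W_{xx}$'s, then recovers the same coefficients from 3 mixed bottles without using the $W_{xx}$'s, and finally estimates $M_{00}$.

```python
import numpy as np

# Hypothetical source water mass values W[i, j]: row i is water mass i+1,
# column j is parameter M_j. All numbers are invented for illustration.
W = np.array([
    [10.0,  2.0, 34.5],
    [25.0, 12.0, 35.5],
    [18.0,  4.0, 36.2],
])

# Route 1: the W's are known. The equations for M1, M2 and mass
# conservation read B @ p = (M1, M2, 1), so M0 = W[:, 0] @ p becomes
# M0 = l1*M1 + l2*M2 + l0 with (l1, l2, l0) solving B.T @ l = W[:, 0].
B = np.vstack([W[:, 1], W[:, 2], np.ones(3)])
l1, l2, l0 = np.linalg.solve(B.T, W[:, 0])

# Route 2: the W's are unknown. Three bottles with different (here
# hypothetical) mixing fractions are enough to determine the three l's.
fractions = np.array([
    [0.2, 0.5, 0.3],
    [0.6, 0.1, 0.3],
    [0.1, 0.3, 0.6],
])
bottles = fractions @ W  # column j holds M_j for each bottle
A = np.column_stack([np.ones(3), bottles[:, 1], bottles[:, 2]])
l_from_bottles = np.linalg.solve(A, bottles[:, 0])  # (l0, l1, l2)

# Bottle 0: M1 and M2 are measured, M0 is to be estimated.
p0 = np.array([0.3, 0.3, 0.4])
M00_true, M10, M20 = p0 @ W
M00_est = l_from_bottles @ np.array([1.0, M10, M20])
```

In this noise-free setting both routes give the same coefficients and `M00_est` matches `M00_true`. With measurement errors one would use more than 3 bottles and a least-squares fit instead of an exact solve.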

If the parameter $M_0$ is not only passively mixed but also has sources and sinks that are linearly related to a parameter $M_x$ (as the nitrate source is linearly related to the apparent oxygen utilization through the Redfield ratio), we can simply add this term to the formula for $M_0$:

\begin{displaymath}M_0 = l_0 + l_1*M_1 + l_2*M_2 + k*M_x \end{displaymath}

and use the same MLR to calculate $M_{00}$.
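In regression terms the source term is just one more column in the design matrix, which raises the number of unknowns to 4 and hence the minimum number of bottles to 4. A brief sketch, assuming NumPy and using made-up coefficients and synthetic bottle data (none of the numbers come from real measurements):

```python
import numpy as np

rng = np.random.default_rng(1)
n_bottles = 12

# Synthetic bottle data (invented ranges).
M1 = rng.uniform(0.0, 10.0, n_bottles)
M2 = rng.uniform(34.0, 36.0, n_bottles)
Mx = rng.uniform(0.0, 200.0, n_bottles)  # stands in for e.g. oxygen utilization

# Made-up "true" coefficients used only to generate M0.
true_coeffs = np.array([3.0, 0.8, -0.05, 0.1])  # (l0, l1, l2, k)
M0 = true_coeffs[0] + true_coeffs[1] * M1 + true_coeffs[2] * M2 + true_coeffs[3] * Mx

# Same MLR as before, with Mx appended as an extra regressor.
A = np.column_stack([np.ones(n_bottles), M1, M2, Mx])
coeffs, *_ = np.linalg.lstsq(A, M0, rcond=None)
l0_hat, l1_hat, l2_hat, k_hat = coeffs
```

Here `k_hat` plays the role of the factor k; it is estimated together with the mixing coefficients rather than prescribed in advance.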

Other processes that can affect $M_0$, such as radioactive decay or solubility, are not linearly correlated with any parameter $M_x$. The Redfield ratio in the last formula may also not be constant, which results in a locally varying factor k. But if only small deviations from a fixed reference are considered, we can linearize these processes within a small error margin and once again use the MLR. One way to consider only small deviations is to use only bottles in the proximity of bottle 0, where $M_{00}$ is to be calculated. A positive side effect of this more local approach is that fewer water masses have to be considered in the calculation.
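The benefit of the local approach can be illustrated with a toy case, assuming NumPy. The quadratic relation, the single predictor, the bottle positions and the choice of the 6 nearest bottles are all assumptions made for this sketch; they only stand in for a nonlinear process that is nearly linear over a small range.

```python
import numpy as np

def fit_and_predict(M1_fit, M0_fit, M1_target):
    """Fit M0 = a0 + a1*M1 on the given bottles and evaluate at M1_target."""
    A = np.column_stack([np.ones(len(M1_fit)), M1_fit])
    coeffs, *_ = np.linalg.lstsq(A, M0_fit, rcond=None)
    return coeffs[0] + coeffs[1] * M1_target

# Hypothetical bottles along one coordinate x (e.g. depth, arbitrary units).
x = np.linspace(0.0, 4.0, 41)
M1 = x.copy()
M0 = M1 ** 2  # deliberately nonlinear relation between M0 and M1

# Bottle 0: position and M1 are known, M0 is to be estimated.
x0 = 2.05
M10 = 2.05
M00_true = M10 ** 2

# Global fit: all bottles.
global_est = fit_and_predict(M1, M0, M10)

# Local fit: only the 6 bottles closest to bottle 0.
nearest = np.argsort(np.abs(x - x0))[:6]
local_est = fit_and_predict(M1[nearest], M0[nearest], M10)

global_err = abs(global_est - M00_true)
local_err = abs(local_est - M00_true)
```

In this example the local fit has a much smaller error than the global one, because the curve is close to a straight line over the small range spanned by the neighbouring bottles.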

The following is an incomplete list of parameter correlations, not necessarily linear ones, which are not due to mixing or to the equation of state.

Juergen Holfort 2004-08-12