Next: Conclusions Up: Multiple linear regression as Previous: Data interpolation

Further applications

The MLR method can be used not only to interpolate missing data, but also to infer missing parameters, to calculate temporal changes, and to obtain consistent data on a regular grid.

If certain parameters were not measured at some stations but were measured at surrounding stations, the MLR allows us to calculate the missing parameters. As an example, total inorganic carbon (T-C) on WOCE A9 was measured only at every second or third station, but with the MLR we are able to calculate the T-C content for every bottle. The surrounding stations do not even have to be from the same cruise; we can use historic data as long as we can assume that the coefficients have not changed with time. For most parameters, time invariance of the coefficients can be expected, because most of them are related either to the Redfield ratio, which is normally assumed to be constant in time and space, or to the water mass characteristics, which in the multi parameter water mass analysis (see Klein and Siedler (1995); Tomczak (1981); Maamaatuaiahutapu et al. (1992)) are normally also assumed to be constant. In this way we could calculate PO$_4$ values for the WOCE A8 section (where the measurements did not work properly) using data from the Oceanus cruise 133. As said before, care has to be taken with data consistency: if the oxygen data from one cruise have an offset of 2 $\mu$mol/kg, an inferred NO$_3$ value most probably has an offset of $2 \times 16/135 \approx 0.24$ $\mu$mol/kg, where 16/135 is the Redfield ratio of NO$_3$ to O$_2$.
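The inference of a missing parameter described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the program used for the calculations in this paper: the function names `fit_mlr` and `predict_mlr`, the choice of temperature, salinity and oxygen as predictors, and all numerical values are hypothetical, and the bottle data are synthetic. The last lines show how an offset in one predictor passes into the inferred value through the corresponding coefficient, which plays the role that the Redfield ratio plays in the NO$_3$ example.

```python
import numpy as np

def fit_mlr(predictors, target):
    """Least-squares fit of target = a0 + sum_i a_i * predictor_i."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs

def predict_mlr(coeffs, predictors):
    """Apply fitted MLR coefficients to a set of predictor values."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    return design @ coeffs

# Synthetic bottle data, made up for illustration only.
# Columns: temperature, salinity, oxygen.
rng = np.random.default_rng(0)
predictors = np.column_stack([
    rng.uniform(2.0, 20.0, 40),
    rng.uniform(34.0, 36.0, 40),
    rng.uniform(150.0, 250.0, 40),
])
true_coeffs = np.array([2500.0, -8.0, -5.0, -0.7])  # arbitrary, not real values
target = predict_mlr(true_coeffs, predictors)

# Assume the target was measured only on every third bottle.
measured = np.arange(40) % 3 == 0
coeffs = fit_mlr(predictors[measured], target[measured])
inferred = predict_mlr(coeffs, predictors[~measured])

# An offset of 2 units in the oxygen predictor shifts every inferred value
# by 2 times the oxygen coefficient.
shifted = predictors[~measured].copy()
shifted[:, 2] += 2.0
offset = predict_mlr(coeffs, shifted) - inferred
```

Because the synthetic target is exactly linear in the predictors, `inferred` reproduces the withheld values; with real data the residuals of the fit give a measure of the uncertainty.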

If the missing parameter is not in steady state, as is the case for all anthropogenic tracers and even for T-C because of the anthropogenic influence, we could try to include time as a parameter in the MLR. A probably more interesting possibility, however, is to determine the increase between two cruises directly. If we have two stations at almost the same spatial position but at different times, we can simply subtract the measured concentrations. But the measured change at a fixed position is a combination of a change in water mass composition and a temporal change within a water mass (see Holfort et al. (2000)). Consider a position in the western boundary region of the South Atlantic at 1500 dbar: at time T1 we are within the North Atlantic Deep Water (NADW) core of the deep western boundary current. At time T2 the current core has moved westward and we are within a region of upper circumpolar water (uCPW). A measured change in CFC-11 at this position is then most probably due to this change in water mass composition and not due to a temporal change in the water mass characteristics of NADW or uCPW. With the MLR we use the coefficients calculated from one cruise to interpolate the respective tracers onto the other cruise. Because of the close relationship to the multi parameter water mass analysis, we are in effect mapping a water mass of the first cruise onto the same water mass of the second cruise and therefore exclude effects of spatial variability. As we are using the MLR coefficients from the first cruise at time T1, the interpolated values are the values for time T1, and their difference from the values measured at T2 gives us the temporal change of the respective tracer within the water mass.
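The difference between the direct subtraction and the MLR-based estimate can be illustrated with a small synthetic example. This is a sketch under assumptions of my own choosing: the helper `design_matrix`, the two predictors (temperature and salinity), the coefficients and the uniform tracer increase of 5 units are all hypothetical and do not come from the cruise data.

```python
import numpy as np

def design_matrix(predictors):
    """Prepend a constant column so that the fit includes an intercept."""
    return np.column_stack([np.ones(len(predictors)), predictors])

rng = np.random.default_rng(1)
relation = np.array([-100.0, -2.0, 3.5])  # arbitrary tracer relation at T1

# Cruise 1 at time T1. Columns: temperature, salinity (synthetic values).
pred_t1 = np.column_stack([rng.uniform(2.0, 5.0, 30),
                           rng.uniform(34.5, 35.0, 30)])
tracer_t1 = design_matrix(pred_t1) @ relation

# Cruise 2 at time T2, same positions: the water masses have shifted,
# and the tracer has increased by true_change within each water mass.
pred_t2 = pred_t1 + np.column_stack([rng.normal(0.0, 0.5, 30),
                                     rng.normal(0.0, 0.05, 30)])
true_change = 5.0
tracer_t2 = design_matrix(pred_t2) @ relation + true_change

# Coefficients from cruise 1, applied to the predictors of cruise 2,
# give the tracer values the cruise 2 water would have had at T1.
coeffs_t1, *_ = np.linalg.lstsq(design_matrix(pred_t1), tracer_t1, rcond=None)
tracer_t1_on_cruise2 = design_matrix(pred_t2) @ coeffs_t1

mlr_change = tracer_t2 - tracer_t1_on_cruise2  # temporal change only
naive_change = tracer_t2 - tracer_t1           # includes water mass shift
```

In this idealised setting `mlr_change` recovers the imposed increase at every bottle, whereas `naive_change` scatters around it because it also contains the change in water mass composition.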

Figure 5: The temporal change in T-C (in $\mu$mol/kg) from 1988/89 to 1991 as calculated using MLR. Crosses denote individual data points; the line with error bars gives the mean and standard deviation of individual depth intervals.
\includegraphics[width=7cm]{TCzeit.eps}

Figure 5 gives the T-C change between SAVE (1988) and WOCE A10 (1994) at 30$^\circ$S in the eastern basin using MLR. In the surface waters, assuming equilibrium with the atmosphere, the expected increase in T-C for an increase in atmospheric CO$_2$ from 345 ppm in 1988 to 357 ppm in 1994 is about 7 $\mu$mol/kg. This is also about the mean value of the T-C changes found with the MLR in the surface waters. We also find some increase in the depth range of the North Atlantic Deep Water/upper circumpolar water (1000-2000 dbar) and a slight increase towards the bottom, probably associated with Antarctic Bottom Water.

Numerical models that include some chemistry or biology (e.g. climate models including parts of the carbon cycle) need at least a consistent starting point for parameters such as temperature, salinity, nutrients, etc., normally on a regular grid. One such data set is the World Ocean Atlas (WOA; Conkright et al., 2001). As the gridding was done separately for the different parameters, inconsistencies between the parameters can arise. If, for example, the data coverage for temperature is good enough to resolve a front, but sparseness of the nitrate data leads to a smooth transition instead of a front, then temperature-nitrate relations can occur in the gridded fields that were not observed in the data. An MLR approach using temperature, salinity and oxygen data, and assuming that the gridded fields of these variables are consistent, should give consistent grid values for the other parameters as well. In this way I calculated mean fields of silicate, nitrate, T-C, alkalinity, etc. using bottle data from the World Ocean Circulation Experiment (WOCE; WOCE, 2000) and the 1 degree fields of the WOA (Conkright et al., 2001). A quality check was performed on the WOCE data set, eliminating some apparent errors and converting all data to the same units. The results of this first calculation (Figure 6) compare well with the available WOA fields, although the WOA fields are smoother than the MLR fields. Large deviations or, in the case of parameters not available in the WOA, suspicious extreme values can be found in regions with low data coverage. The results are therefore still not totally satisfactory, but mostly because of the input data and not because of the MLR method used. Although WOCE represents an enormous effort, its data coverage in some regions is sparse to nonexistent (e.g. Polar Ocean, some regions near the coast, etc.), especially for less commonly measured parameters such as alkalinity.
Often only data from a single cruise are used for the interpolation, which therefore represents a single snapshot rather than a time mean; this can explain in part why the field is not as smooth as the original WOA field. Including more historic data can minimize these problems, but the task of DQE (excluding wrong measurements, correcting for offsets, etc.) is not a small one. Although the chosen proximity was not very large (2$^\circ$ by 2$^\circ$ and 253m+25%), the multiplication factor was large in order to get an almost complete gridded data set, so that a quite large proximity was used in some regions where data are sparse. This sometimes leads to larger errors because data were extrapolated from regions far away. In addition, although the program used has the option to include barriers, this option was not used in this calculation. Interpolation across land barriers like Central America was therefore possible, meaning that data from the Caribbean could be used to infer a data point in the Pacific Ocean. So some work still has to be done to arrive at consistent gridded fields; here I just wanted to show that the MLR method can be used for this purpose.
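The gridding procedure with a growing proximity can be sketched roughly as follows. This is not the program used for the calculation: the function `grid_value_by_mlr`, its parameters (`half_width`, `min_points`, `max_factor`), the doubling of the search box and the synthetic data are all assumptions made for illustration. The sketch also ignores the depth criterion, longitude wrap-around and land barriers.

```python
import numpy as np

def grid_value_by_mlr(node_lon, node_lat, node_predictors,
                      lon, lat, predictors, target,
                      half_width=1.0, min_points=10, max_factor=8.0):
    """Estimate the target at one grid node from nearby bottle data.

    The search box starts at 2*half_width degrees on a side and is doubled
    until it holds at least min_points bottles or exceeds max_factor.
    Returns (value, factor used), or (nan, None) if too few data are found.
    """
    factor = 1.0
    while factor <= max_factor:
        near = ((np.abs(lon - node_lon) <= half_width * factor)
                & (np.abs(lat - node_lat) <= half_width * factor))
        if near.sum() >= min_points:
            design = np.column_stack([np.ones(near.sum()), predictors[near]])
            coeffs, *_ = np.linalg.lstsq(design, target[near], rcond=None)
            # Apply the local coefficients to the gridded predictor values.
            value = np.concatenate([[1.0], node_predictors]) @ coeffs
            return value, factor
        factor *= 2.0
    return np.nan, None

# Synthetic bottle data, made up for illustration only.
rng = np.random.default_rng(2)
n = 200
lon = rng.uniform(0.0, 20.0, n)
lat = rng.uniform(-10.0, 10.0, n)
predictors = np.column_stack([rng.uniform(2.0, 20.0, n),      # temperature
                              rng.uniform(34.0, 36.0, n),     # salinity
                              rng.uniform(150.0, 250.0, n)])  # oxygen
relation = np.array([300.0, -1.5, -6.0, -0.2])  # arbitrary coefficients
target = relation[0] + predictors @ relation[1:]

node_predictors = np.array([10.0, 35.0, 200.0])  # gridded T, S, O2 at the node
value, factor = grid_value_by_mlr(10.0, 0.0, node_predictors,
                                  lon, lat, predictors, target)
expected = relation[0] + node_predictors @ relation[1:]

# A node far from all data returns no estimate instead of extrapolating.
far_value, far_factor = grid_value_by_mlr(100.0, 0.0, node_predictors,
                                          lon, lat, predictors, target)
```

Returning the factor along with the value makes it possible to flag grid nodes whose estimate relies on data from far away, which is where the larger errors mentioned above are to be expected.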

Figure 6: Silicate at the surface in the Indian Ocean from the World Ocean Atlas (a) and interpolated using the WOCE dataset (b). Panel (c) shows the interpolated alkalinity field at the surface and panel (d) a section across the Atlantic at 329.5$^\circ$E.
\includegraphics[width=7cm]{WOAsio4.eps} \includegraphics[width=7cm]{WOAsio4int.eps} \includegraphics[width=7cm]{WOAalka.eps} \includegraphics[width=7cm]{WOAsio4atl.eps}


Juergen Holfort 2004-08-12