Macroinvertebrate Classification Diagnostic Tool Development
August 2007

This project (WFD60) forms part of the UK Strategy for the implementation of the EC Water Framework Directive (WFD: European Union, 2000). Within its broad remit the WFD requires the development of ecological classification tools for the purpose of determining ecological status, with reference to specific environmental pressures. The WFD requires that these tools assign lakes to one of five categories (High, Good, Moderate, Poor, Bad) to indicate conditions relative to what is considered to be “good status”. This report focuses on the development of a tool with which to determine the extent of the pressure of acidification on lake macroinvertebrate communities.

Objectives of research

The primary objective is the development of a method and tool with which to assess the pressure of acidification (a major threat to the ecology of acid-sensitive fresh waters, particularly in the UK uplands) on the benthic macroinvertebrate assemblage of lakes.

Key findings and recommendations

Tool development under WFD60 was severely delayed by problems in obtaining sufficient high-quality biological and chemical data. The dataset used to support this phase is still less than satisfactory: it comprises data for only 105 sites and represents only a subset of the chemical variables that would have been useful for explanatory data analysis. Owing to the paucity of acid anion data from one source and of dual endpoint (or Gran) alkalinity data from another, the final physico-chemical dataset was built using one of two commonly used expressions of acid neutralising capacity (ANC) and a few associated determinands.

Our assessment of the literature regarding macroinvertebrate-acidification inference techniques concluded that none were appropriate for this assignment. In most cases macroinvertebrate communities have been used to infer pH, but pH per se carries little information on acid sensitivity or the likelihood that a site has acidified.

We show, through an investigation of the output of the Steady State Water Chemistry (SSWC) Model and palaeoecological diatom-pH reconstructions, how ANC can be used as an indicator of damage, in terms of modelled ANC change, diatom-inferred pH change and the mobilisation of labile inorganic aluminium (Allab).

Furthermore, we show that prediction of the likelihood and level of acidification can be refined by using ANC in conjunction with calcium concentration.

Assessment of chemical data from the UK Acid Waters Monitoring Network demonstrates that Allab concentration, possibly the most important agent of damage associated with acidification, will rarely if ever reach biologically toxic concentrations in sites with an ANC above 40 µeq l⁻¹.

Conversely, sites which currently have a negative ANC are highly likely to exhibit biologically toxic Allab concentrations.

We show that ANC and Allab explain as much variance in a small, high-quality macroinvertebrate dataset as pH does, and propose that macroinvertebrate community structure may carry sufficient information for the level of physico-chemical damage to be inferred through its relationship with ANC and calcium concentration.

In the expanded dataset, representing 105 lakes, we again show that ANC is strongly related to the principal axis of macroinvertebrate species variation between sites.

We show that certain attributes of macroinvertebrate community structure pertinent to normative definitions also vary along an ANC gradient. In particular, a crude measure of macroinvertebrate species richness, taken as the total number of taxa identifiable to species level, is tightly related to ANC. This is consistent with observations in the literature that macroinvertebrate diversity may be reduced by anthropogenic acidification but not by natural acidity (i.e. at sites where pH is depressed by organic acids only). Several individual species show sharply truncated distributions on Allab gradients, and species often ceased to be present in waters with mean annual Allab concentrations over 10 µg l⁻¹.

We created a “damage matrix” to provide an a priori physico-chemical classification of all sites in the WFD60 database by ANC and calcium concentration into WFD-compliant classes, i.e. HIGH, GOOD, MODERATE, POOR, BAD. Owing to the sparsity of the data, we then condensed these into three classes, representing HIGH-GOOD, MODERATE and POOR-BAD.
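The condensing step can be sketched in a few lines of Python. This is an illustration only: the ANC and calcium thresholds that define the damage matrix are not given in this summary, so the sketch starts from classes that have already been assigned, and the example site classes are hypothetical rather than WFD60 data.

```python
from collections import Counter

# Five WFD classes mapped to the three condensed classes named in the text.
CONDENSED = {
    "HIGH": "HIGH-GOOD",
    "GOOD": "HIGH-GOOD",
    "MODERATE": "MODERATE",
    "POOR": "POOR-BAD",
    "BAD": "POOR-BAD",
}


def condense(wfd_class: str) -> str:
    """Return the condensed class for one of the five WFD classes."""
    return CONDENSED[wfd_class.strip().upper()]


# Hypothetical a priori classes for six sites (illustrative, not WFD60 data).
example_classes = ["HIGH", "GOOD", "GOOD", "MODERATE", "POOR", "BAD"]

# Count how many sites fall in each condensed class.
counts = Counter(condense(c) for c in example_classes)
print(counts)
```

Counting sites per condensed class in this way makes any imbalance between classes visible before a classifier is trained.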

We used a classification tree approach to predict the a priori defined class of each site using its macroinvertebrate assemblage. Classification trees are a powerful yet simple way of predicting classes from a set of predictor variables (in this case, macroinvertebrate species and broader macroinvertebrate groups).

After testing a large range of biological input variables, including data at species level (i.e. the proportions of individual taxa), we found that summary data alone were sufficient to maximise the successful classification rate. These summary data were the minimum species richness (MSR) of the full assemblage, the minimum number of species in certain biological groups, and the proportion of individuals represented by certain groups. The final classification tree used these variables only.

We found that a simple rule, i.e. MSR above or below 12.5, provided the most powerful criterion for distinguishing between damage classes at the primary level. Further splits were based on the number of non-leptophlebiid (i.e. mostly acid-sensitive) mayfly taxa, the presence/absence of bivalves, the proportion of Ephemeropteran, Plecopteran and Trichopteran individuals in the entire assemblage, and the minimum number of stonefly taxa. The apparent misclassification rate of this tree was 18.3%. We determined that the tree should be able to assign class status correctly to random independent samples 77-78% of the time.
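As a small worked illustration of the primary split: because MSR is a whole-number count, a threshold of 12.5 separates sites with 12 or fewer species from sites with 13 or more. The sketch below shows only which side of the split a site falls on. The function names are hypothetical, and the class attached to each branch, along with the thresholds of the later splits, is not stated in this summary and so is not modelled.

```python
def primary_branch(msr: int, threshold: float = 12.5) -> str:
    """Report which side of the primary MSR split a site falls on."""
    return "above" if msr > threshold else "below"


def apparent_accuracy(misclassification_pct: float) -> float:
    """Convert an apparent misclassification rate (%) to apparent accuracy (%)."""
    return 100.0 - misclassification_pct


print(primary_branch(12))  # 12 species falls below the 12.5 threshold
print(primary_branch(13))  # 13 species falls above the 12.5 threshold
print(round(apparent_accuracy(18.3), 1))
```

The apparent misclassification rate of 18.3% corresponds to an apparent accuracy of 81.7% on the training sites, somewhat higher than the 77-78% expected for independent samples, as is usual when a model is assessed on the data used to build it.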

This simple approach was able to distinguish between acidified and naturally acid (i.e. high DOC, low sulphur deposition) lakes that tend to support relatively large numbers of taxa. Apparently more complex species-based models, such as the Acid Water Indicator Community model (AWIC), are perhaps better tuned to predict pH but have limited value in this sense.

While the divisions on this tree form our current “best” model, we have major reservations with respect to the total number of sites in the dataset and the distribution of sites at the acidified end of the gradient. The model as it stands is clearly not fit for purpose but would benefit greatly from the addition of 30-40 more sites in an acidified condition.

While this is a categorical approach to classification, class predictions could be converted to EQR-compatible site scores to meet WFD requirements. There are a number of methods to achieve this, but the most robust would use a method known as “bagging” to determine the probability of membership of each site in the most likely class and neighbouring classes, to provide a sliding score. The proposed increased number of sites would be essential for this technique to be used effectively.
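The bagging idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in the project. The training data are invented, a nearest-class-mean rule on species richness stands in for a full classification tree, and the conversion of class probabilities to a 0-1 score is one hypothetical choice among several.

```python
import random
from collections import Counter, defaultdict

CLASSES = ["POOR-BAD", "MODERATE", "HIGH-GOOD"]  # ordered worst to best

# Hypothetical (species richness, class) pairs; illustrative only.
TRAINING = [
    (4, "POOR-BAD"), (6, "POOR-BAD"), (7, "POOR-BAD"), (9, "POOR-BAD"),
    (10, "MODERATE"), (11, "MODERATE"), (13, "MODERATE"), (14, "MODERATE"),
    (16, "HIGH-GOOD"), (18, "HIGH-GOOD"), (21, "HIGH-GOOD"), (25, "HIGH-GOOD"),
]


def fit_nearest_mean(sample):
    """Base learner (stand-in for a tree): mean richness of each class."""
    groups = defaultdict(list)
    for richness, label in sample:
        groups[label].append(richness)
    return {label: sum(v) / len(v) for label, v in groups.items()}


def predict(model, richness):
    """Assign the class whose mean richness is closest to the site's."""
    return min(model, key=lambda label: abs(model[label] - richness))


def bagged_probabilities(training, richness, n_models=200, seed=0):
    """Share of bootstrap-trained models voting for each class."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap resample: draw with replacement, same size as training set.
        boot = [rng.choice(training) for _ in training]
        votes[predict(fit_nearest_mean(boot), richness)] += 1
    return {c: votes[c] / n_models for c in CLASSES}


def sliding_score(probs):
    """Probability-weighted class position scaled to 0-1 (hypothetical scoring)."""
    return sum(i * probs[c] for i, c in enumerate(CLASSES)) / (len(CLASSES) - 1)


for site_richness in (5, 12, 22):
    p = bagged_probabilities(TRAINING, site_richness)
    print(site_richness, p, round(sliding_score(p), 2))
```

A site close to a class boundary splits its votes between neighbouring classes and so receives an intermediate score, which is what allows a categorical prediction to be expressed on a continuous scale.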

We tested the tool qualitatively on 51 sites for which chemical data were not adequate for inclusion in the original training set. Generally, the classification of sites was highly consistent with geographical location, although a few sites were clearly misclassified.

Current model weaknesses are likely to be principally due to the paucity of sites for which data are available at the acid and acidified end of the physico-chemical gradient. The imbalance of sites in the training set also prevents us from deriving predictions of the probability of correct classification using a “tree bagging” technique.

We recommend that biological and physico-chemical data are gathered for a further 30-40 acidified sites before any attempt is made to refine the existing model.

Before implementation, we recommend that the tool is tested on 1) time series data, to allow an assessment of temporal variability of output, and 2) sites for which detailed multi-proxy biological records are available, so that the macroinvertebrate-inferred damage class can be related to wider-ecosystem indications of damage by acidification.

Keywords: Water Framework Directive, Lakes, Acidification, Littoral Macroinvertebrates, Classification, Classification Trees, Acid Neutralising Capacity, Aluminium, pH.

Copies of this report are available from the Foundation, in electronic format on CD-ROM at £20.00 + VAT or hard copy at £25.00, less 20% to FWR members.
N.B. The report is available for download from the SNIFFER website.