What does it all mean?

Just over a quarter of a century ago, my friend and colleague Steve Juggins and a group of other palaeoecologists came up with a clever way to relate the composition of diatom samples taken from different levels of a sediment core to the environmental conditions of the lake at the time that these diatoms were alive.   At the heart of this was a set of statistical tools called “transfer functions” and the use of these has proliferated over subsequent years, spilling from diatoms to many other groups of organisms and from palaeoecological studies to contemporary investigations of man’s impact on the environment.   So pervasive have these methods become that Steve returned to the subject a few years ago and critiqued the many misuses of the method that he was seeing in the literature.

The principle behind the use of transfer functions is that each species has a characteristic response to an environmental pressure gradient (in early studies this was pH) which can be portrayed as a unimodal (approximately bell-shaped) curve. The point along the gradient where a species is most abundant represents the “optimum” condition, the level of the pressure at which the species thrives best. The average of the optima of all organisms in a sample, weighted by their abundance, could then be used, Steve and colleagues showed, to estimate the value of the pressure. This unlocked the door to quantitative reconstructions of changes in acidification of lakes in the UK and Scandinavia that, in turn, ultimately shaped environmental policy. It was one of the most impressive achievements of applied ecologists in the 20th century.
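The arithmetic behind this is simple enough to sketch in a few lines of Python. What follows is only the barest form of weighted averaging, with invented taxa and pH values, and it leaves out refinements (such as the “deshrinking” step) that published transfer functions include. The function names are my own, not those of any particular software package.

```python
def species_optima(training_counts, env_values):
    """Estimate each taxon's optimum as the abundance-weighted mean of the
    environmental variable across the training samples in which it occurs."""
    weighted_sum = {}
    total = {}
    for counts, env in zip(training_counts, env_values):
        for taxon, n in counts.items():
            weighted_sum[taxon] = weighted_sum.get(taxon, 0.0) + n * env
            total[taxon] = total.get(taxon, 0.0) + n
    return {t: weighted_sum[t] / total[t] for t in total if total[t] > 0}


def infer_environment(sample_counts, optima):
    """Estimate the environmental variable for a sample as the
    abundance-weighted mean of the optima of the taxa it contains."""
    numerator = 0.0
    denominator = 0.0
    for taxon, n in sample_counts.items():
        if taxon in optima:  # taxa without an optimum are ignored
            numerator += n * optima[taxon]
            denominator += n
    if denominator == 0:
        raise ValueError("no taxa in the sample have a known optimum")
    return numerator / denominator


# Invented training set: three lakes with measured pH and diatom counts.
training = [
    {"taxon_A": 80, "taxon_B": 20},
    {"taxon_A": 20, "taxon_B": 60, "taxon_C": 20},
    {"taxon_B": 20, "taxon_C": 80},
]
ph = [5.0, 6.0, 7.0]

optima = species_optima(training, ph)
core_sample = {"taxon_A": 50, "taxon_C": 50}  # invented sediment-core sample
print(optima)
print(infer_environment(core_sample, optima))
```

Note that the optima themselves are estimated from the training set, so any bias in how the training sites were chosen is carried straight through to the reconstruction.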

A diagrammatic representation of the principle behind transfer functions: each organism has a characteristic response to the predominant pressure (nutrient/organic pollution in this case).

Part of the reason for their success in building strong predictive models was, I suspect, that the pollutant that they were focussed upon had a direct effect on the physiology of the cells which, in turn, created strong selective pressures on the community. Another reason was that palaeoecological samples condense all the habitat variation within a lake (plankton v benthic, seasonal differences etc) into a single assemblage. This, then, raises the question of how well we should expect transfer functions to perform when applied to assemblages which represent much narrower windows of space and time, and when the pollutants of interest exert indirect rather than direct effects on the organisms. Or, to recast that question another way, are some of the problems we encounter interpreting diatom indices from rivers another form of the misuse of transfer functions that Steve dissects in his review?

It is easy to believe that transfer functions do work when applied to contemporary diatom assemblages from rivers. If you evaluate datasets you will almost certainly find that the “optima” for all the species do appear to be arranged in a continuum along the pressure gradient. The question that we need to ask is whether this represents a causal relationship or is just a statistical artefact. I touched on this issue in “What we expect is often what we get …” but, in that post, I was mostly interested in how samples react along a gradient, not the response of individual species. I suspect that, given the importance of alkalinity in freshwater algal ecology (see “Ecology in the Hard Rock Café”), this must influence the distribution of optima along a nutrient gradient. This will be compounded when sample sizes are small, as the likelihood is that the sample optimum will not correspond exactly to the “true” optimum for the species in question (a question Steve has also addressed in a more recent paper – see reference list below). Finally, this is all embedded within a larger problem: that most of the work I have discussed here involves statistical inference from datasets compiled from samples collected from a range of sites in a region, but is intended to address changes in time rather than space (so-called “space-for-time substitution” – see reference by Pickett below). There has been relatively little testing of species preferences under controlled experimental conditions.
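The point about sample size can be illustrated with a toy simulation. This is a sketch under invented assumptions, not a reanalysis of any real dataset: a single taxon with a bell-shaped response (a “true” optimum of 50 on an arbitrary 0 to 100 gradient), sites scattered at random along the gradient, and some multiplicative noise on abundance. It compares how much the estimated optimum wanders when it is based on 10 sites and when it is based on 200.

```python
import math
import random
import statistics


def wa_optimum(sites):
    """Abundance-weighted mean of the gradient values at the sampled sites.
    `sites` is a list of (gradient_value, abundance) pairs."""
    total = sum(abundance for _, abundance in sites)
    return sum(x * abundance for x, abundance in sites) / total


def simulate_optimum(n_sites, rng, true_optimum=50.0, tolerance=15.0):
    """Sample n_sites at random along a 0-100 gradient and estimate the
    optimum of a taxon with a Gaussian response plus multiplicative noise."""
    sites = []
    for _ in range(n_sites):
        x = rng.uniform(0.0, 100.0)
        expected = math.exp(-((x - true_optimum) ** 2) / (2 * tolerance ** 2))
        noise = rng.uniform(0.5, 1.5)  # invented noise model
        sites.append((x, expected * noise))
    return wa_optimum(sites)


def spread_of_estimates(n_sites, n_trials=200, seed=1):
    """Standard deviation of the estimated optimum over repeated trials."""
    rng = random.Random(seed)
    estimates = [simulate_optimum(n_sites, rng) for _ in range(n_trials)]
    return statistics.stdev(estimates)


print("spread with 10 sites: ", spread_of_estimates(10))
print("spread with 200 sites:", spread_of_estimates(200))
```

The spread is much wider for the small dataset, even though the underlying response curve is identical in both cases. Real data add confounding variables on top of this sampling noise.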

In practice, I suspect, the physiological response of benthic algae to nutrients is less complicated than our noisy graphs suggest.   I set out a version of this in “What we expect is often what we get …”.   That post dealt primarily with communities of microalgae; this is the same basic scheme (with some slight revisions) but posed in terms of the physiological response of the organisms.  It borrows from the habitat matrix conceptual model of Barry Biggs, Jan Stevenson and Rex Lowe (which, itself, builds on earlier work on terrestrial plants by Phil Grime and colleagues).

An alternative explanation for the response of benthic algae to nutrients and organic pollution.  a., b., c. and d. are explained in the text.

  a. Low nutrients / high oxygen concentrations – the “natural state” in most cases. Biggs et al. referred to species adapted to such conditions as “stress-adapted” as they can cope in situations where nutrients are scarce. Associated with TDI scores 1 and 2. Examples: Hannaea arcus, Achnanthidium minutissimum, Tabellaria flocculosa.
  b. High nutrients / no “secondary effects” of eutrophication – these are “competitive” species in Biggs et al.’s template and can thrive when there is anthropogenic enrichment of nutrients. Ideally, this group would consist of species that have a physiological adaptation that allows them to thrive when nutrients are plentiful though, in practice, our understanding is based mostly on inference from spatial patterns. The “window” where such species can thrive is wide and will, in many cases, overlap with the two states described below. Associated with TDI scores 3 and 4. Examples: Amphora pediculus, Rhoicosphenia abbreviata, Cocconeis pediculus. Cladophora glomerata would be a good example of a non-diatom that belongs to this group.
  c. High nutrients plus “secondary effects” of eutrophication – this category extends the habitat template of Biggs et al. to include organisms that are reacting to secondary effects of nutrient enrichment (e.g. shade and low oxygen) rather than to the elevated nutrients per se and is, consequently, difficult to differentiate from a direct response to organic pollution. Associated with TDI scores 4 and 5. Examples include several species of Nitzschia as well as Mayamaea and Fistulifera, amongst others. Importantly, this group may co-exist with representatives from group b. – perhaps inhabiting different zones of the biofilm that typically blend together when a sample is taken.
  d. High nutrients / very low oxygen – a final category that represents extreme situations when an ability to cope with reducing conditions is beneficial, and where diatoms that are facultative heterotrophs may thrive. Associated with TDI score 5. Heterotrophic fungal and bacterial growths (“sewage fungus”) may also be abundant. Once again, there is likely to be some overlap between this and other groups. Technically, this group is more likely to be associated with serious organic pollution than with nutrients; however, it will be found at sites where nutrient concentrations are high and it is possible that an association with nutrients may be inferred from spatial patterns.
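To show what working with categories rather than optima might look like, here is a minimal sketch that summarises a sample as the proportion of valves falling into each of the four categories above (labelled a. to d., as in the figure caption). The lookup table contains only the examples named in the list. Treating every Mayamaea and Fistulifera as group c. is my simplification for illustration, Nitzschia is deliberately left out because only some of its species belong there, and the sample counts are invented.

```python
# Taxa named in the text as examples of each group; far from a complete list.
GROUP_BY_SPECIES = {
    "Hannaea arcus": "a",
    "Achnanthidium minutissimum": "a",
    "Tabellaria flocculosa": "a",
    "Amphora pediculus": "b",
    "Rhoicosphenia abbreviata": "b",
    "Cocconeis pediculus": "b",
}

# Genus-level fallback: an illustrative simplification, not a recommendation.
GROUP_BY_GENUS = {
    "Mayamaea": "c",
    "Fistulifera": "c",
}


def group_for(taxon):
    """Return the group letter for a taxon, or 'unassigned' if unknown."""
    if taxon in GROUP_BY_SPECIES:
        return GROUP_BY_SPECIES[taxon]
    genus = taxon.split()[0]
    return GROUP_BY_GENUS.get(genus, "unassigned")


def group_proportions(counts):
    """Proportion of the total count that falls into each group."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no valves")
    proportions = {g: 0.0 for g in ("a", "b", "c", "d", "unassigned")}
    for taxon, n in counts.items():
        proportions[group_for(taxon)] += n / total
    return proportions


# Invented sample of 200 valves.
sample = {
    "Achnanthidium minutissimum": 120,
    "Amphora pediculus": 60,
    "Mayamaea atomus": 15,
    "Navicula lanceolata": 5,
}
print(group_proportions(sample))
```

The “unassigned” share is worth watching: the coarser the categories, the more honest we have to be about taxa that we cannot place at all.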

We are left, in other words, with a choice between deriving optima along a continuous scale based on inferences from spatial patterns within which we know that there are significant confounding variables or dividing species into a few physiologically-defined categories for which there is not very much experimental underpinning.   Neither is ideal, and some of our recent analyses suggest that, in terms of model strength, there is little to choose between them.   The former, in my view, suggests an artificially high level of precision that is unrealistic, given the current state of knowledge.   The latter, on the other hand, links the data to a conceptual model rather than simply relying upon the numbers that squirt out at the far end of a statistical process.

That does not mean that a continuous series of optima might not be appropriate for some other groups of organisms. The reason why I urge simplicity for diatoms is largely because of the scale of the habitats that we are sampling, in relation to the wider patterns of variability. Macrophyte surveys, for example, encompass all visible organisms found along a 100 m stretch. These will have a range of life history and nutrient acquisition strategies: some of these will take up nutrients from the water, some from the sediments. Different types of sediment will vary in the supply of phosphorus and nitrogen, and so on. There will still be issues of confounding variables and risks of inferring from correlative rather than causal relationships, but perhaps the overall patchiness experienced over the survey length will create a more complex web of interactions between nutrients and community that justifies a continuous scale.

For diatoms, however, simplicity is probably the best choice. In the absence of definitive evidence one way or the other we apply Occam’s Razor (“entities should not be multiplied unnecessarily”) and opt for the simpler of the two hypotheses pending evidence to the contrary. This, in turn, may address a deeper issue: that of finding robust answers to complex problems (see “Unravelling causal thickets …”). Inference from statistical models is only as good as the conceptual models that underpin them and, I fear, we are too often so lost in the detail of the many confounding variables that we lose sight of our goals. Being able to understand our observations in terms of ecological process is the first step to finding robust solutions to our problems.


Bennion, H., Juggins, S. & Anderson, N.J. (1996).  Predicting epilimnetic phosphorus concentrations using an improved diatom-based transfer function and its application to lake eutrophication management. Environmental Science & Technology 30: 2004-2007.

Biggs, B.J.F., Stevenson, R.J. & Lowe, R.L. (1998). A habitat matrix conceptual model for stream periphyton. Archiv für Hydrobiologie 143: 21-56.

Birks, H.J.B.,  Line, J.M., Juggins, S., Stevenson, A.C. & ter Braak, C.J.F.  (1990). Lake surface-water chemistry reconstructions from palaeolimnological data. Diatoms and pH reconstruction. Philosophical Transactions of the Royal Society of London Series B 327: 263-278.

Juggins, S. (2013).  Quantitative reconstructions in palaeolimnology: new paradigm or sick science?  Quaternary Science Reviews 64: 20-32.

Kelly, M.G., King, L. & Ní Chatháin, B. (2009).  The conceptual basis of ecological status assessments using diatoms.  Biology and Environment: Proceedings of the Royal Irish Academy 109B: 175-189.

Pickett, S.T.A. (1988). Space-for-time substitution as an alternative to long-term studies. Pp. 110-135. In: Long-term Studies in Ecology: Approaches and Alternatives (edited by G.E. Likens). Springer-Verlag, New York.

Reavie, E.D. & Juggins, S. (2011).  Exploration of sample size and diatom-based indicator performance in three North American phosphorus training sets.  Aquatic Ecology 45: 529-538.

Who needs a “red list” anyway?

The previous two posts suggested that it might be possible to construct a provisional red list of freshwater diatoms, albeit with several caveats.   The question that still needs to be answered is whether there is any real benefit to such an exercise.

I think we can say with some confidence that a red list of freshwater diatoms will not precipitate a crisis of conscience amongst the national conservation bodies or wildlife trusts. There will be no rush to draw up plans to add rare diatoms to Biodiversity Action Plans and no Sites of Special Scientific Interest will be designated because of the unique diatoms found there. So why bother?

One problem that I identified in the first post in this series (see “A red list of endangered British diatoms?”) is that those of us who have been studying algae have never been part of the widespread tradition of wildlife recording that takes place around Britain and which is the basis for the red lists of many other groups of plants. We make our own lists, for sure, but there is no centralised system for either recording or validating records. This activity has, for many groups, been the preserve of enthusiastic amateurs and, whilst there are amateur phycologists, their numbers are well below those required to develop meaningful distribution maps. At present, for many freshwater algae, the distribution maps are more likely to show you where the small number of collectors are most active than to offer any profound insights into biogeography. The freshwater diatoms are an exception here, as I hope I have shown, albeit with several caveats.

The benefits of better recording are twofold. The first is simply to raise the profile of algae amongst the conservation movements. I have already shown that algae represent a large part of the UK’s total biodiversity (see “The sum of things …”). In so doing, I added myself to the long list of phycological whingers and windbags who vent their spleens at the way that algae are invariably overlooked by conservationists. If we want to be taken seriously, we need to start producing evidence of a quality equivalent to that for other groups of organisms. Distribution maps are a step in that direction. They are possible not just for some freshwater diatoms but also for some other types of freshwater algae (see ““Looking” is not the same as “seeing”” for an example). My hope is that production of a preliminary list might, itself, flush out further records and generate a dialogue amongst phycologists and beyond.

The second point is that systematic recording of distributions will, itself, throw up some testable hypotheses. I suggested, in the previous post, that Gomphonema tergestinum might have a restricted distribution that cannot be explained solely by chemical conditions. I’ll return to that in a future post but I suspect that there may be others that also show unexpected patterns. In other words, better recording might well lead to better insights into the ecology of these organisms. We may, indeed, have missed the boat on this topic: the distribution patterns of many species have already been shown to change in response to climate warming (see below for a reference to one example).   Last year I wrote about Hydrurus foetidus, a chrysophyte that I found growing in high altitude streams in Norway (see “A brief excursion to Norway”).   I know that it has been recorded in this country but I have never seen it here. It would be interesting to look at locations where it has been found in the past and see if there is any evidence for it growing now and, indeed, whether there have been any shifts in its distribution patterns.

And my final point is even more basic.   There is already a provisional atlas of the slime moulds of Britain and Ireland. If they can do it for slime moulds, surely we can do it for freshwater algae too. Our professional pride is at stake …


Fox, R., Oliver, T.H., Harrower, C., Parsons, M.S., Thomas, C.D. & Roy, D.B. (2014). Long-term changes in the distribution of British moths consistent with opposing and synergistic effects of climate and land use change. Journal of Applied Ecology 51: 949-957.


A “red list” of endangered British diatoms?

I have had two conversations about rare algae over the past two weeks. The first was an invitation to contribute to an exercise to develop a list of diatoms that might form the basis of a “red list” of endangered algae. The second was a retort from a colleague that such an act would be meaningless as algae don’t have the same biogeographical restrictions on their distributions as higher organisms, and that all algae will grow anywhere so long as the environment is suitable.   The argument that algae don’t have biogeographical restrictions is an old one (summarised as “everything is everywhere, the environment selects”) but several recent papers have shown this to be wrong. Some species do appear to be cosmopolitan, as my previous post shows, but others do seem to be restricted to particular regions. Even if the local environment does play a large role in determining the algae that are found at a location, that does not seem to obviate the need for a list of endangered algae. On the contrary, it might even help focus attention on locations where efforts to restore a site might make a real difference.

The problem, in my opinion, is more basic: phycologists working in freshwaters do not have a strong tradition of systematic recording of the distribution of organisms. You only need to look at the Freshwater Algal Flora of the British Isles, and to see how many species are described as “probably cosmopolitan”, to realise the scale of the problem.

Because of their widespread use in ecological assessment, the diatoms are one group of freshwater algae where there may be enough data to start making some sensible judgements about the rarity, or otherwise, of individual species. I had a look at a dataset compiled for a project that I was involved with a few years ago in order to see what might be possible. This dataset comprises 6500 samples from 3305 sites spread across Great Britain and Ireland, most of which also have location information. The basis for conservation assessments is the distribution in 10 km squares, termed “hectads”, of which 1111 were represented in my dataset, out of a total of about 3000 in Britain.   The two criteria I am using are “nationally rare” for species that occur in 15 hectads or fewer, and “nationally scarce”, for those which are only found in 16 to 100 hectads. Using these criteria, I produced a “long list” of 150 diatom taxa that are “nationally scarce” and a further 226 which may be “nationally rare”. This, however, is where the real work starts.
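The counting itself is straightforward, and can be sketched as follows. This assumes that each record carries an easting and northing in metres on a single national grid (records from Britain and Ireland sit on different grids, so in practice the two would need to be handled separately), and the records shown are invented. The thresholds are the ones quoted above.

```python
from collections import defaultdict

HECTAD_SIZE_M = 10_000  # a hectad is a 10 km x 10 km square


def hectad(easting_m, northing_m):
    """Identify the hectad containing a point by the grid indices of its
    south-west corner (assumes non-negative grid coordinates in metres)."""
    return (int(easting_m // HECTAD_SIZE_M), int(northing_m // HECTAD_SIZE_M))


def hectads_per_taxon(records):
    """Count the distinct hectads in which each taxon has been recorded.
    `records` is an iterable of (taxon, easting_m, northing_m) tuples."""
    squares = defaultdict(set)
    for taxon, easting_m, northing_m in records:
        squares[taxon].add(hectad(easting_m, northing_m))
    return {taxon: len(found) for taxon, found in squares.items()}


def rarity_class(n_hectads):
    """Apply the two criteria used in the text."""
    if n_hectads <= 15:
        return "nationally rare"
    if n_hectads <= 100:
        return "nationally scarce"
    return "neither"


# Invented records: two fall in the same hectad, one in a neighbouring square.
records = [
    ("taxon_A", 412345, 523456),
    ("taxon_A", 418000, 529999),
    ("taxon_A", 420000, 523456),
]
for taxon, n in hectads_per_taxon(records).items():
    print(taxon, n, rarity_class(n))
```

A count produced this way says nothing about recording effort, which is why the caveats that follow matter at least as much as the numbers.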

Scanning down this list, I see several problems that need to be addressed before we can make serious judgements about the rarity, or otherwise, of particular taxa. However, I do also see a number of taxa on this list that I do believe to be genuinely rare or scarce and which are, at least, worthy of more study.   The problems are many and will spill into the next post but let’s make a start:

  1. The dataset I’m using is for rivers, and I will need to merge it with some additional datasets to get good coverage of lakes and also of soft water and acid habitats. I would not trust this provisional analysis to give an accurate overview of the distribution of acid-loving Eunotia species, for example, nor of planktonic diatoms such as Asterionella formosa.
  2. I also noticed some taxa typical of brackish waters which have been recorded occasionally in freshwaters (e.g. Bacillaria paxillifer). More comprehensive coverage of coastal and estuarine environments would probably show many of these to be quite common. The same reasoning applies to those diatom species associated with terrestrial habitats (e.g. Hantzschia amphioxys).
  3. Most of the samples in our database come from rock or plant surfaces and it is likely that diatoms that prefer other substrata have been under-recorded, which will complicate interpretation of their distribution. Absence of evidence is not evidence of absence.
  4. Many of the taxa that are rarely recorded belong to groups that have been subject to taxonomic uncertainty over the past few decades, leading to variations in how they have been recorded. Some of the rare diatoms are “varieties” of common species but, as these often are (or were) poorly described in the literature, many analysts have not tried to distinguish them.
  5. Finally, we have to be sure that the records actually represent living populations. Because diatomists usually work from the empty silica frustules, we cannot tell whether a cell was alive at the time it was collected.   If you find a number of frustules of the same species, then it is reasonable to assume that some of these were alive, but if a species is represented by a single frustule, we have to consider the possibility that this was washed into the site from elsewhere, and never actually grew there.

The positives from this process are that I think we can start to make some judgements about the rarity (or otherwise) of diatoms that are reasonably well circumscribed in the literature (i.e. a low chance of misidentification) and where the underlying taxonomy has been relatively stable. A further criterion at this stage is that the candidate taxa must be typical of streams and rivers and, ideally, associated with hard surfaces rather than soft sediments. That’s quite a lot of caveats, but in the next post I’ll start to sort through the list and see if we have some genuine candidates for “scarce” or “rare” diatoms.


The dataset referred to was developed for:

Kelly, M.G., Juggins, S., Guthrie, R., Pritchard, S., Jamieson, J., Rippey, B., Hirst, H. & Yallop, M. (2008). Assessment of ecological status in U.K. rivers using diatoms. Freshwater Biology 53: 403-422.