One small step for digital Palaeontology

The time of digital technology is upon us. No scientific domain is embracing its fast-paced and dynamic progression more than Palaeontology. One realm that is exploding with new studies, and enrapturing the minds of people and the global media, is the increasing ability to digitise and manipulate three-dimensional fossils. Surface laser-scanning, C-T scanning and mechanical digitisers are all now commonplace in palaeontological studies. The implications of such techniques are far-reaching, from reconstructing robotic dinosaurs (see video), to understanding vertebrate biomechanics at an intricate level. Other palaeontologists digitally reconstruct the internal anatomy of various organisms; for example, in the Herefordshire deposits in the UK, digital models are recreated from exquisitely preserved fossils within nodules to look at the evolution of the internal structures that were pivotal in the evolution of extant hyperdiverse invertebrate groups, such as arthropods.

It is pretty well established that the fossil record is fraught with completeness issues. I covered the problem of this in a previous post in terms of understanding biodiversity patterns in deep geological time, in the context of lineage completeness. Another problem however is individual specimen completeness. Several authors have attempted to compensate for this secondary level of ‘bias’, using various quantitative metrics, and use these to guide assessments of biodiversity through time in specific lineages (e.g., sauropod dinosaurs). Another problem is that often, fossils have been ‘squished’ and distorted by the weight of successive layers of rock over the thousands or millions of years they have been buried for. This is a problem which is typically found in dinosaur skulls, making them somewhat resemble Imhotep in The Mummy (this may be fictional).

Imhotep is pissed

Ugly beyond all reason, possibly as a result of post-mortem decay. "You talking to me?!"

Geometric morphometrics is something that I’ve mentioned in previous posts. It sounds awful, the very mention of it usually enough to put people off or make them smash a keyboard upside your head. But thanks to several review papers, the basic concepts are now much easier to grasp and apply to a variety of scientific hypotheses. Statistics are quantitative, easy to record, less subjective than qualitative statements, and available for repeated manipulation through a wide variety of methods. The integration of geometry-based analysis is now commonplace in almost every aspect of Palaeontology, intimately coupled with an increase in the availability of digital techniques. The fact that you (usually) don’t have to damage unique specimens during the process is a bonus too!

The latest analysis, and a critical study for palaeontologists and museum curators around the world, uses geometry-based reconstruction of a poorly-preserved fossil to digitally reconstruct missing or distorted parts. And the best part about it is that it’s fully open access (including all supplementary videos). The comment that “this method does not require specialised software or artistic expertise” is perhaps a bit misleading, though, as you firstly need a fossil and a CT scanner (or a previous scan), a pretty beasty PC, and the software mentioned is hardly cheap (Rhinoceros is €195 for a student license, and for Geomagic the cheapest price I could find was $8000). The actual software used (Mimics) appears to be free, but I’m still awaiting confirmation for downloading. Additional software, such as MeshLab and Autodesk Maya, is freeware, at least for trial versions.

Clack et al. set out to build a method of digital reconstruction that builds upon previous methods, giving greater geometric accuracy. The methods revolve around using a digital mesh obtained through laser or C-T scanning as a model for a landmark-based geometric reconstruction. The sample specimen is a vertebra from the infamous tetrapod fossil Acanthostega. Only one half of the vertebra is actually preserved, therefore this was digitally reconstructed and attached to its mirror image, creating a bilaterally symmetrical three-dimensional element.
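
The mirroring step is easy to picture in code. What follows is only a minimal sketch of the geometric idea, not the authors' actual workflow: it assumes the preserved half has already been oriented so that the mid-line (sagittal) plane sits at x = 0, and the function name and coordinates are hypothetical.

```python
def mirror_across_sagittal_plane(points, tol=1e-9):
    """Return the original points plus their reflections across the plane x = 0.

    Assumes the preserved half is already oriented so that the sagittal
    (mid-line) plane coincides with x = 0. Points lying on the plane are
    kept once and not duplicated.
    """
    mirrored = [(-x, y, z) for (x, y, z) in points if abs(x) > tol]
    return list(points) + mirrored


# Hypothetical coordinates for the preserved right half of a neural arch.
right_half = [(0.0, 0.0, 5.0), (1.2, 0.5, 3.0), (2.0, 1.0, 0.0)]
whole = mirror_across_sagittal_plane(right_half)
print(len(whole))
```

Here the one point sitting on the mid-line is kept once, so three preserved points become five. A real mesh would also need its faces re-wound and the two halves stitched along the seam, which is where dedicated modelling software earns its keep.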

Landmark selection involved a mixture of Type 1 and Type 2 landmarks; that is, topographically homologous points mixed with sites of geometric significance, such as local maxima or minima of curvature. These were used as the basis for constructing a surficial grid of contour lines describing the medial and lateral geometry of the neural spine. Videos of the processes involved are actually available online, embedded within the article, a really awesome and useful addition, making the whole methodology more transparent and easier to replicate, should you wish. There’s not really much else to say about the methodology; the processes, such as modelling and surface extrapolation, are laid out systematically and are reasonably easy to understand for anyone with an understanding of the concepts of geometry and fossils.

The resultant reconstructions are high quality, smooth and geometrically faithful in representing the original vertebra in three dimensions, free of any taphonomic deformation or distortion, and with missing parts accurately reproduced. The groups of models created are validated using Procrustes superimposition and principal components analysis, two standard statistical techniques. The first two principal components do appear to have a low explanatory power however (PC-1 = 24.3%), which may be an issue relating to the complexity in the form of the vertebra. The authors are right to discount the use of the thin-plate spline technique, as this is known to be misleading in that the deformation patterns it produces are homogeneous with respect to the landmark configuration, leading to potentially false morphological variation in areas of no data, something which is largely overlooked.
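
For anyone wondering what Procrustes superimposition actually does, here is a rough two-dimensional sketch. It is not the paper's procedure (which works on many three-dimensional models at once); it simply aligns one hypothetical landmark configuration onto another by removing position, scale and rotation, and all names and coordinates are made up for illustration. The principal components step is not shown.

```python
def procrustes_align(reference, target):
    """Superimpose `target` onto `reference` (equal-length lists of (x, y)).

    Removes position, scale and rotation, then returns the aligned target
    and the residual distance left between the two configurations.
    """
    def centre_and_scale(points):
        z = [complex(x, y) for x, y in points]
        centroid = sum(z) / len(z)
        z = [p - centroid for p in z]
        size = sum(abs(p) ** 2 for p in z) ** 0.5  # centroid size
        return [p / size for p in z]

    ref = centre_and_scale(reference)
    tgt = centre_and_scale(target)
    # In 2D the least-squares rotation has a closed form in complex numbers.
    s = sum(r.conjugate() * t for r, t in zip(ref, tgt))
    rotation = s.conjugate() / abs(s)
    aligned = [t * rotation for t in tgt]
    distance = sum(abs(r - a) ** 2 for r, a in zip(ref, aligned)) ** 0.5
    return [(a.real, a.imag) for a in aligned], distance


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
# The same triangle rotated by 90 degrees, tripled in size and shifted.
moved = [(10.0, -4.0), (10.0, 8.0), (1.0, -4.0)]
# A genuinely different (flatter) triangle.
squashed = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0)]

_, dist = procrustes_align(triangle, moved)
_, dist_squashed = procrustes_align(triangle, squashed)
print(dist, dist_squashed)
```

The first distance comes out at effectively zero, because the two triangles differ only in position, size and orientation; the second stays clearly above zero because the shape itself differs. That residual is the variation that goes on to be summarised by principal components.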

Acanthostega model reconstruction, half-fish half-muppet; Copyright - Eliot Goldfinger

The advantages of the techniques explored here are in the handling style of the models, and their statistical power and accuracy. Furthermore, anyone can conduct or replicate these methods, providing they have access to an initial CT scan. The potential applications are numerous too: digital models of reconstructed elements can give more accurate parameters for biomechanics where data may have been previously extrapolated in a subjective or qualitative manner; it may yield hitherto unknown data for character construction, which may in turn increase the validity of phylogenetic analysis. The landmark mapping procedure may need refinement in terms of increasing the number of points, such as by using semi-landmarks, which will more accurately reconstruct the surface geometry and open the way for other statistical procedures.

The study represents a great step forward though in accurate specimen reconstruction, and reveals another field in which the power of geometric morphometric techniques is unparalleled. A limitation could be that to reconstruct missing parts, you have to have an idea of what the gross geometry is, meaning at least one half of a bilaterally symmetrical element must be present. This means that if you wanted to reconstruct the neural spine for example, it would be impossible if the whole part was absent, even if the entire centrum was preserved. This is something that could be integrated in future using close relatives of the species that are being reconstructed.

What is a Fossil Species..?

What do we currently understand by a ‘species’?

Naming species, also known as alpha taxonomy, forms the fundamental basis and core of systematic analysis (e.g., for biodiversity, macroevolutionary and ecological studies). Since the origin of the species concept, there has been heated and continuous debate as to what exactly constitutes a species. The discovery of DNA as an evolutionary tool sparked a vigorous new line of discussion into what precisely defines a species. Even to this day, despite a wealth of theoretical, empirical and philosophical studies, there is still a lack of consensus in the way of rigorously defining a species unit. This is not to say that there isn’t a general idea of what a species is (ask any biologist or palaeontologist); in fact most people reading this will probably have a pretty good idea of what they define a species as. But there is not total agreement, not by a long shot. Furthermore, most if not all current species concepts are explicitly based on extant organisms which can be directly observed in their everyday life, and also just happen to provide a near-endless supply of DNA. But what about fossils? I’ve outlined the critical importance of using fossils in conjunction with pretty much any systematic analysis before (here), but how do palaeontologists actually recognise and delimit fossil species? This is a pretty serious issue, considering the DNA of fossil organisms has almost always decayed long before exhumation (except in exceptional circumstances), and fossil remains typically represent only a biased sample of the organisms they once were.

What are the current species concepts?

For biologists, the species problem can be framed as: “What level of divergence (morphological, genetic, etc.) between populations constitutes species diagnosis?” This can be modified slightly according to whichever species concept is being applied (see below). Using DNA as a sole basis for species delimitation is fraught with issues, including but not limited to paralogy, lateral (horizontal) gene transfer, arbitrary delimitation protocols, lack of data (e.g., in tropical species), and often a lack of training or instrumentation (in third world countries mainly). The relative issues and benefits of morphological and/or DNA-based analysis are a tale for another time though. Currently, there is no single ‘silver bullet’ technique for species delimitation (although many DNA taxonomists will try and pretend there is..). What we actually have is a series of non-independent concepts that apply to different stages of the speciation process (de Queiroz 2007 discusses this in a most brilliant manner). Here are a couple of examples:

Biological Species Concept: This is the one most people will have heard of. Species are defined by reproductive isolation, or the ability to produce fertile offspring. Obvious issues with this are if you’re asexual, and how do you know if two organisms (within reason) can or cannot mate if they are not sympatric. Also, reproductive isolation is not always congruent with morphological divergence, so is inadequate with purely morphological data sets.

Phylogenetic Species Concept: This refers to diagnosability based on the monophyly of a population. This invariably invokes the use of DNA. Genetic population divergence goes through three stages: polyphyly, paraphyly and finally reciprocal monophyly, giving two or more irreducible clusters of diagnosable organisms with a traceable pattern of ancestry and descent.

Genealogical Species Concept: This is the use of multiple gene marker distributions to delimit putative species by identifying periods of complete lineage sorting. Essentially this means that the incongruence from coalescence (the point in time where gene variants unite in a gene genealogy) no longer affects delimitation.

A currently widely used method is DNA barcoding. Some molecular systematists deem this a powerful enough tool to entirely replace standard Linnaean taxonomy, although (obviously) there are numerous vocal objections. DNA barcoding operates on the assumption that there is a threshold for species delimitation based on a single gene: the entirely arbitrary requirement that genetic divergence between species (interspecific) be 10 times greater than divergence within them (intraspecific), leading to the concept of reciprocal monophyly. It works sometimes, but is fraught again with theoretical and empirical problems. (I love the idea that molecular systematists will go to the tropics with the aim of identifying unique or diverse haplotypes in insects etc., by killing as many organisms as possible; “We’ve found a unique haplotype! We must therefore preserve this beetle at all costs!”, as the decapitated beetle floats around the dissection plate..)
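
To make the threshold idea concrete, here is a toy sketch. It uses the raw proportion of differing sites between aligned sequences, which is a simplifying assumption on my part (real barcoding studies generally use model-corrected distances on a standard marker gene), and the sequences and function names are invented for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def mean_pairwise_distance(sequences):
    """Mean p-distance over all pairs within one putative species."""
    pairs = list(combinations(sequences, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def exceeds_ten_times_rule(known_species, candidate):
    """True if the candidate's mean divergence from the known species is
    more than 10x that species' mean intraspecific divergence."""
    intra = mean_pairwise_distance(known_species)
    inter = sum(p_distance(candidate, s) for s in known_species) / len(known_species)
    return inter > 10 * intra


# Invented 20-base 'barcodes' for three individuals of one species.
known = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "ACGTACGTACGTACGTACGT",
]
distant = "TTTTTTTTTTTTACGTACGT"
close = "ACGTACGTACGTACGTACTT"

print(exceeds_ten_times_rule(known, distant))
print(exceeds_ten_times_rule(known, close))
```

The distant sequence clears the threshold and the close one does not. Notice how much hangs on the intraspecific estimate: sample three individuals instead of thirty and the 'species boundary' can move a long way, which is one of the empirical problems alluded to above.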

How do these concepts relate to fossils?

Every single one of these concepts relies on either direct observational data (e.g., sympatry for the BSC), or the use of DNA. Few modern studies rely solely on morphology to delimit species (annoyingly, seeing as it is directly coupled with behaviour, ecology etc.; DNA is just, well, DNA..). So really, with regards to fossils, in which phenotype is the only aspect preserved (and ecology etc. accordingly inferred), as well as the spatio-temporal context in which it exists, how can these concepts be applied? Well, they can’t really. So what can palaeontologists do..?

How are fossil species delimited?

In principle, there are two different methods of species delimitation: a discovery-based approach, and a hypothesis-based approach. The former makes no a priori assumptions regarding the putative species in a sample, only delimiting subsequent to analysis (e.g., DNA barcoding/taxonomy, cladistics). The latter requires an a priori assumption of what species already exist within a sample, with the analysis being a validation test. Papers vary as to whether a full or partial cladistic analysis is carried out (if at all) when the focus of the paper is the erection and description of a new species. By partial analysis, I simply mean that the authors observe the synapomorphies of a specific clade and see if their specimen(s) match or not. This is a pretty horrendous breach of taxonomy and cladistic methodology, as it ignores the fact that every single character placement and its polarity is influenced by the addition of new species (in fact, this is the principal method by which cladograms are initially constructed). Full analysis is the dominantly used method, thankfully, given the accessibility of free software and relative simplicity in executing cladistic analysis (although there may be issues in obtaining and extracting previous data sets, but that’s another tale too. For someone else.) This leads us on to the next part.

Bring on Cladistics

Cladistics is the method that systematists use to forge a hierarchical grouping of taxa into discrete subsets, or clades, for the inference of common ancestry between species and groups. A clade is defined by a node (or sometimes a branch) – the point of intersection of two or more branches – that represents the common ancestry and speciation of all subsequent taxa. Each node is represented by one or more shared derived characters (synapomorphies) between all branches, and hence taxa, emanating from the node. If the taxa in question are species (i.e., terminal branches), then the minimum required number of synapomorphies to give a sister-taxon relationship is one, and the minimum number of required autapomorphies (unique derived characters) to ‘split’ the branch into two separately recognised entities is one. That is, cladistics can recognise discrete units, including species, on the basis of a single unique character, regardless of the size of the initial character set. There are statistical methods of assessing the strength or support of this (e.g., pseudo-replication analyses, branch decay tests), but the point remains that a species can be delimited through cladistic analysis based on the possession of a single unique character. [this is a really simple overview, there are numerous web-pages and texts out there that describe cladistic methodology in more detail; just search.]
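
The 'single unique character' point can be illustrated with a toy character matrix. This is a deliberately crude sketch, not a cladistic analysis: it assumes binary characters whose polarity is already known (0 ancestral, 1 derived) and simply counts the derived states held by one taxon alone, whereas real software optimises characters across a tree. The taxa and scores are invented.

```python
def autapomorphies(matrix):
    """Map each taxon to the characters in which it alone shows the derived state.

    `matrix` maps taxon name -> list of binary character states, where
    0 is assumed ancestral and 1 derived (polarity is taken as given).
    """
    taxa = list(matrix)
    n_chars = len(next(iter(matrix.values())))
    unique = {taxon: [] for taxon in taxa}
    for i in range(n_chars):
        holders = [t for t in taxa if matrix[t][i] == 1]
        if len(holders) == 1:
            unique[holders[0]].append(i)
    return unique


# Invented scores for four characters (indexed from 0).
toy_matrix = {
    "Taxon A": [1, 1, 0, 0],
    "Taxon B": [1, 0, 1, 0],
    "Taxon C": [0, 0, 0, 0],
}
result = autapomorphies(toy_matrix)
print(result)
```

Character 0 is shared by Taxa A and B, so it unites them; characters 1 and 2 are each unique to one taxon, so each of A and B is diagnosable on exactly one character. Taxon C has nothing unique at all, which is the awkward position many fragmentary fossils find themselves in.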

It seems that there are two main methods of delimiting fossil species: qualitatively, whereby the fossil simply looks different but the differences are not broken down into discrete characters; and quantitatively, where the species name is supported by x number of autapomorphies, and the strength or support of the diagnosis is a function of x, and is testable through cladistic methods. This is pretty much the only method available to palaeontologists given the relative paucity of fossil data. But then how many autapomorphies are required for a diagnosis to be interpreted as ‘strong’, or valid? And to what extent are species therefore comparable? It’s a problematic issue that I haven’t actually come across much at all in the published literature. If I’m mistaken, please do point me in the right direction! What is perhaps required though, is a rigorous species concept that is directly compatible with the full range of fossil diversity, and that extant taxa can be integrated into.

One thing to consider though is that species are treated as discrete entities when these concepts are applied; is this the correct approach when really a lineage on which an organism sits is, by definition, continuous? What do we gain by stamping an arbitrary and highly subjective boundary on this continuum? A method of classification. It has heuristic value in systematics, but it seems that the fundamental treatment of species as discrete units may need some consideration. Furthermore, speciation is a pretty stochastic and deterministic process, and the application of delimitation criteria must be flexible to account for the variation between lineages. Unless someone comes up with something really neat. Like..

Future prospects? Geometric Morphometrics. 3D automated species recognition software, based on robust statistical delimitation procedures. It’s awesome. Watch this space!

Disclaimer: I’ve probably missed out huge amounts here; this is such a massively studied field, that it’s been difficult to even shrink down to these couple of paltry pages! Comments as always are more than welcome! There are simply too many references to list here too. If people would like to read more about the subject, drop me an email (jon.tenannt.2[at]gmail.com), and I’ll happily whizz a few papers [legally..] your way, depending on taste!

Final thought: with respect to all of the work that has gone into validating ‘species’, what has been done to test the validity of higher taxonomic units, such as Family and Order, or even the Genus..?

For reading all that, here’s a snap of the Iguanodon specimen on display at the museum in Oxford, England.

Surprised American for scale

Quantitative Shape Analysis 2: Data Collection – easier than you think!

“If you want to inspire confidence, give plenty of statistics. It does not matter that they should be accurate, or even intelligible, as long as there is enough of them.” Lewis Carroll (1832-1898)

The last post in this series gave an introduction to the background and significance of quantitative shape analysis. I introduced the use of landmarks, or geometric co-ordinates, as the basis for statistical analysis of shape. The last article finished by stating this article would discuss different methods of geometric morphometric analysis, but I forgot one crucial step: Data Collection! Here, I present a simple and efficient way of collecting data for use as the basis for a range of geometric morphometric analyses.

Following is an example of data collection from a simple coursework study I did last year, looking at cranial allometry in carnivores. Firstly, you need a target or hypothesis for your analysis. The target here was to use exemplar carnivorous mammal species to look at shape variation in the skull, and to interpret it in terms of form and dietary function. The first decision to make is what points to use as your landmark data. I’ll use a hypothetical skull as an example.

Chosen selection of landmarks - you can choose any, as long as they are topographically correspondent between all specimens as described in the last post

Each one of these landmarks represents a specific topographically correspondent point amongst all specimens in the sample. For the sake of simplicity in this example, assume that the lower jaw and the cranium are a single module. The landmarks can be defined as such:

Cranial landmarks (right-lateral aspect; red)

1. Posterior extremity of occipital margin (type 3)

2. Tympanic aperture (centre) (type 1)

3. Posteroventral extremity of occipital condyle (type 3)

4. Ventral extremity of dorsal postorbital process (type 3)

5. Rostral extremity of orbital periphery (type 3)

6. Mid-point on ventral maxillary margin between premolars and canines (type 3)

7. Ventral deflection in dorsal margin (maximum curvature) [rostral to postorbital] (type 2)

8. Dorsal expansion in dorsal margin (maximum curvature) [posterior to external nares] (type 2)

9. Anterior extremity of premaxilla (type 3)

10. Dorsal extremity of dorsal margin (type 3)

11. Ventral extremity of zygomatic arch-jugal suture (type 1)

12. Position of distal border on posterior-most tooth (maxillary) (type 1)

Lower jaw landmarks (right-lateral aspect; blue)

13. Posterior extremity of angular process (type 3)

14. Posterior-most (distal) extent of dentary molars (type 3)

15. Mid-point on ventral dentary margin between premolars and canines (type 3)

16. Anterior extremity of dentary (type 3)

17. Point of posterodorsal deflection of ventral margin, culminating in angular process (type 2)

18. Ventral pinnacle of coronoid process (dentary) (type 2)

This is just a hypothetical example to show landmark positions and how to define them. Real data is freely available for almost anything on the internet. A series of sample images can be easily obtained through MorphBank, for example. If anyone reading this would like, I can send them a copy of the images I used for this coursework as a trial data set – just drop me a quick message with your email address.

Converting these landmarks into usable geometric data is possible through a number of image analysis programs. A good one to use is ImageJ, freely available on the web. An important thing to note at this point is that every image you import into this program, or any other, must show the specimen from the same angle, or as close to it as possible (e.g., all in a precise lateral view of the skull).

Using ImageJ you can simply import an image with pre-defined landmarks as above, use the ‘Point’ tool to click on the landmark, hit ctrl-m (or choose Analyze > Measure from the menu), and hey-presto, you have the two-dimensional geometric co-ordinate of that point in a table! Consecutive points can then be added to this table for each specimen. Do this for all points per specimen in a pre-defined numerical sequence (as indicated above), then simply export to an Excel spread-sheet. A rather nifty thing you can then do is plot them as a graph, and you’ll see a landmark representation of your image (awesomeness of this depends on how many landmarks you use). Repeat for all specimens, and you have a comparable data set. Simple eh! Note that this can be done free of scale, so you don’t need to measure any lengths or inter-landmark distances. A future post will cover how to compensate for this in quantitative shape analysis. What you want to end up with at this stage is a single spread-sheet, with a labelled tab for each specimen, and containing a series of geometric co-ordinates that are topographically related between specimens.
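
If you would rather skip the spread-sheet stage, the exported table can be read directly. The sketch below is a hedged illustration only: it assumes the measurements were saved as comma-separated text with columns named X and Y (what actually gets exported depends on your measurement settings), that all specimens sit in one continuous table rather than separate tabs, and the sample values and function name are made up.

```python
import csv
import io


def read_landmarks(csv_text, landmarks_per_specimen):
    """Split a table of X/Y measurements into one configuration per specimen.

    Rows are assumed to be in the pre-defined landmark order, specimen
    after specimen. Any extra columns in the table are ignored.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    coords = [(float(r["X"]), float(r["Y"])) for r in rows]
    if len(coords) % landmarks_per_specimen:
        raise ValueError("row count is not a multiple of the landmark count")
    return [coords[i:i + landmarks_per_specimen]
            for i in range(0, len(coords), landmarks_per_specimen)]


# Made-up export: two specimens, three landmarks each.
sample = """X,Y
10.0,20.0
30.5,22.0
18.0,40.0
12.0,21.0
33.0,25.5
20.0,44.0
"""
specimens = read_landmarks(sample, landmarks_per_specimen=3)
print(len(specimens), specimens[1][0])
```

The check on the row count is there for a reason: a single missed or double-clicked landmark shifts every point after it, and the resulting 'shape variation' will be pure clicking error.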

There are of course more techniques using more complex software and data imaging methods (using surface or outline data, laser scans etc.), but typically these will not be accessible to the general public. The above procedure is a convenient and free method of obtaining a decent and workable initial data set, without having to spend endless time in a museum collection or laboratory.

So, now you know the procedure, nothing should stop anyone from going out there and collecting a data set, constructing a series of landmarks and digitally obtaining their geometric co-ordinates. Right? Next time, I’ll actually discuss how to assemble this data into a format that you can use to input to some free software, and several analyses you can then conduct with this software (e.g., Principal Components Analysis).

Why Geometric Morphometrics Kicks Ass 1

Those of you who have read my recent articles will probably have noticed the phrase ‘geometric morphometrics’ a few times. When mentioned to people, the usual reaction is to melt or run away screaming satanic verse and tearing chunks of hair out (pers. obs). This is largely due to the pretty intense mathematical basis behind the huge variety of implementable statistical procedures, which range from simple linear regressions to more complex 3D extended-eigensurface analyses. Each of these essentially provides a quantitative method of analysis of biological structures that can be interpreted in terms of biological function, a pretty crucial aspect of both zoology and palaeontology. To get to grips with the necessary analytical tools, it’s not really important to dig into the fundamentals such as how to construct a covariance matrix – these are well-defined mathematical concepts. The aim of the following few posts is to break down what is one of the most powerful yet under-used tools available for bio-structural analysis. What I’d also like to achieve is some kind of informal discussion about ideas in which geometric morphometrics can be applied to simple ‘pilot’ analyses based on freely available information, such as photographs from Morphbank. This first post will deal with the initial concepts, and future articles will provide examples of the different methods and tools available. Hopefully you will find this useful, and begin to openly develop and understand the processes involved.

Geometric morphometrics, as you might infer from the name, is the statistical analysis of form using geometric co-ordinates, or Cartesian landmarks. Form is defined here as the total dimensionality resulting from both shape and size. Size is the totality of spatial dimensions within a form, and shape is defined as the aspect of a form’s geometry that remains after scale (i.e., size), position (translation) and rotation have been normalised. Shape is essentially a localised metric for describing variation of spatial dimensions. Distinguishing between these is actually pretty crucial, as typically in an analysis you will want to differentiate between size and shape. Recently, the field has accelerated in strength due to the ability to capture three-dimensional structures such as skulls using techniques such as computed tomography (CT) scanning. This considerably increases the information available for geometric morphometricians, and has led to numerous concurrent methodological adaptations in order to rigorously process available data. Using 3D techniques is kind of like a ‘total evidence’ approach to form analysis.
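
The size/shape split is easier to see with numbers. In this small sketch, size is measured as centroid size (a common convention, assumed here rather than taken from the text above), and two hypothetical triangles that differ only in size and position end up with identical co-ordinates once both are removed. Rotation is left alone for simplicity.

```python
def centroid_size(points):
    """Square root of the summed squared distances of landmarks from their centroid."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) ** 0.5


def remove_position_and_scale(points):
    """Centre a configuration on the origin and rescale it to unit centroid size."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    size = centroid_size(points)
    return [((x - cx) / size, (y - cy) / size) for x, y in points]


small = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
large = [(10.0, 10.0), (18.0, 10.0), (10.0, 16.0)]  # same triangle, doubled and shifted

print(centroid_size(small), centroid_size(large))
print(remove_position_and_scale(small))
print(remove_position_and_scale(large))
```

The two forms differ (one has twice the centroid size of the other) but the two shapes do not, and that is exactly the distinction drawn above.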

My personal opinion is that geometric morphometrics completely out-strips traditional morphometrics in terms of theoretical strength, methodology, and explanatory power. Consider simple linear measurements for starters, in for example, describing a lateral view of any mammalian hind-limb. You can imagine all sorts of bisecting, parallel, oblique and orthogonal measurements that would aid reconstruction of the form. Collecting these measurements and the relative angles would be time-consuming however, especially if you were looking for example, at sexual dimorphism of the femoral head in an antelope population. The world’s supply of coffee would be extinct before completion. However, with a simple photograph and the right software, breaking down a femur into a geometric outline or surface that you can use for all sorts of morphometric wizardry takes seconds (not including the months it takes until you are granted access to specimens).

Ratios are also a statistical over-simplification. The combination of measurements that can produce the same ratio is constrained only by the size of an object, and furthermore, ratios are a gross under-estimate of the potential geometric complexity of an object; try and imagine modelling a sine-wave with a linear ratio (or as complex a morphological structure as you want). Not going to happen is it.

The core of geometric morphometrics revolves around the assessment of allometry. Allometry is a ubiquitous aspect of nature, describing how organisms change their shape as they change in size. Discovering and interpreting allometry is the proximate target of most investigations, with the null hypothesis being isometry: no shape variation with respect to size. Typical investigable targets include detection of heterochronic trends, called paedomorphosis or peramorphosis, relating to the timing of acquisition of certain structures in an organism’s or species’ history (essential for evo-devo analyses).
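
The simplest way to see allometry versus isometry is the classic two-measurement version, where a trait is regressed on size after log-transforming both, and a slope of one means isometry. This is a sketch of that traditional bivariate approach, swapped in for illustration; a geometric morphometric analysis would instead regress shape variables on size. The measurements below are invented and deliberately constructed with a known exponent.

```python
import math


def allometric_slope(sizes, traits):
    """Least-squares slope of log(trait) on log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in traits]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Invented skull lengths, with two traits built to follow known power laws.
skull_length = [50.0, 80.0, 120.0, 200.0]
jaw_depth = [2.0 * s ** 1.5 for s in skull_length]   # exponent 1.5: positive allometry
orbit_width = [0.3 * s for s in skull_length]         # exponent 1.0: isometry

slope_jaw = allometric_slope(skull_length, jaw_depth)
slope_orbit = allometric_slope(skull_length, orbit_width)
print(slope_jaw, slope_orbit)
```

The jaw trait recovers a slope of about 1.5 and the orbit trait a slope of about 1, so only the first departs from the null hypothesis. With real, noisy data you would also want a confidence interval on the slope before claiming anything.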

Landmarks form the principal units of analysis for geometric morphometrics. Landmarks are formally known as Bookstein shape co-ordinates, with a defined Cartesian geometric position (i.e., x, y, z variables). Landmarks represent a subset of possible locations or distances, based on the nature of sampling. The main problems with this approach are difficulties in recognising landmarks, missing data, and possible redundancy of data due to over-lapping inter-landmark spacings.

The data coverage available through morphometrics is exquisite! Each point is an individual landmark

REALLY IMPORTANT: Landmarks are NOT homologous sensu stricto. They represent topographically correspondent ‘characters’ – you should be able to write down the exact location in an unambiguous manner. This is really important when it comes to the biological interpretation of data.

There are three types of Cartesian landmark. Although not necessarily that important, it can provide an idea of how geometrically faithful a representation of an object you have.

Type 1: These represent the juxtaposition of biological components, such as sutures.

Type 2: These represent geometric aspects of form, such as local maxima or minima of curvature. These and type 1 landmarks are typically used in structure-based analyses.

Type 3: These are co-ordinate dependent equidistant interpolations, such as mid-points between two type 1 landmarks. They are also known as semi-landmarks, and are typically used in refining shapes such as profile outlines.

Landmarks form the cornerstone of all geometric morphometric analyses. What they provide you with is an unambiguous and quantitative dataset, that most importantly is highly informative in terms of biological structure. With landmark data, the wealth of potential modes of analysis at your disposal is phenomenal, as are the available software packages. One I would highly recommend is the tps series that can be found here, as well as a rather comprehensive overview of all things geometric.

In the mean-time, I wish everyone here an awesome 2012, and try not to get apocalypsed/raptured. I’ve popped a few references at the bottom here regarding the recent application of morphometrics in the field of vertebrate zoology, definitely worth reading a few just to get to grips with how scientists are currently using the techniques. I also strongly recommend the PalaeoMath series by Norm MacLeod, freely available here through the Palaeontological Association. It’s good stuff, and includes data so you can try your own analyses!

Next time: Principal Components Analysis, Principal Co-ordinates Analysis, and Procrustes superimposition.

Barden, H. E. and Maidment, S. C. R. (2011) Evidence for sexual dimorphism in the stegosaurian dinosaur Kentrosaurus aethiopicus from the Upper Jurassic of Tanzania, Journal of Vertebrate Paleontology, 31(3), 641-651

Brusatte et al. (2011) The evolution of cranial form and function in theropod dinosaurs: insights from geometric morphometrics, Journal of Evolutionary Biology, DOI: 10.1111/j.1420-9101.2011.02427.x

Goswami, A., Milne, N. and Wroe, S. (2010) Biting through constraints: cranial morphology, disparity and convergence across living and fossil carnivorous mammals, Proceedings of the Royal Society B, doi:10.1098/rspb.2010.2031

Hadley, C., Milne, N. and Schmitt, L. H. (2009) A three-dimensional geometric morphometric analysis of variation in cranial size and shape in tammar wallaby (Macropus eugenii) populations, Australian Journal of Zoology, 57, 337-345

Fossils – So Much More Than Just Pretty Dead Things

What do fossils tell us? It’s an obvious question, commonly phrased as ‘What is the point in studying fossils?’, but it can often be one of the more difficult ones to answer objectively. The most prominent reason, that I’m sure a lot of people will agree with, is that we want to know what ancient and often extinct organisms looked like. This promise of discovering the unknown is what captivates people from a young age, and often motivates them into studying fossils as a profession. From fossils, we can infer ecological aspects such as behavioural interactions, feeding strategies, and predator-prey relationships, and how these factors all changed through time. Tracking and reconstructing the co-evolution of the Earth and its biota is one of the most magical and beautiful stories ever to be told.

However, fossils can provide so much more than just aesthetic pleasure. If this weren’t so, it would make grant proposals incredibly difficult – people don’t usually like giving away money just so a fanatic can play with fossils all day. So, palaeontologists have developed numerous excuses to satisfy funding bodies, to show that studying fossils actually has some scientific value.

The real motive is going “WT*expletive deleted*????” Sheer curiosity. Then, this develops into seeing that there was a *expletive deleted*load of stuff that was WAY different – how did this work? What does this tell us about how stuff works today? Mallison (2011)

The following are additional reasons why the study of fossils is not only awesome, but also indispensable in our understanding of biological and geological evolution.

Lineage Reconstruction

This is perhaps the most important use that fossils have for evolutionary biologists and palaeontologists. While genetic analysis might tell you about the particular history of a gene or genome, or the genetic evolution of species or populations (there are key fundamental differences between gene-trees and species-trees, something which molecular systematists miss out A LOT), it tells you virtually nothing about the phenotypic, or morphological, evolution within a lineage. We don’t have many fossilised genetic markers (except in exceptional circumstances, such as permafrost-preserved mammoths), and thus must default to morphological analysis when tracking lineage evolution. While methods do exist for estimating and modelling the temporal evolution of species with respect to their genetic make-up, these can never provide such solid evidence as fossils can in terms of reconstructing ancient organisms, and the evolutionary trajectories leading to what we see surrounding us today.

The next two points are largely based on cladistic methodology. For a nice summary of cladistics, it’s worth quickly checking the following Wikipedia entry here. Essentially, cladistic analysis is the primary method for reconstructing cladograms, or trees, that represent the systematic and hierarchical classification of organisms. Note that cladograms are not to be confused with phylogenetic trees, in which explicit evolutionary trends are inferred (i.e., patterns of ancestor-descendant relationships).

Novel Extinct Morphologies

Cladistics is based on the analysis of characters, which are formally broken down into character states. A character is essentially an aspect of morphology which can be expressed as a number of mutually exclusive variables, or character states. This forms the basis for analysis of species’ relationships and homology assessment. An example of how this can be expressed is:

Maxilla, anterior process, length: shorter (0) or longer (1) than the posterior process (taken from Sereno, 2007)
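As a rough illustration of how a character like this ends up as numbers in a data matrix, here is a small Python sketch that scores taxa from two measurements. The taxon names and lengths are entirely hypothetical, and the scoring function is my own shorthand rather than anything from Sereno (2007).

```python
# States of the binary character, as worded in the example above.
STATES = {0: "shorter", 1: "longer"}


def score_maxilla(anterior_len: float, posterior_len: float) -> int:
    """Score the anterior process as shorter (0) or longer (1) than the posterior."""
    if anterior_len < posterior_len:
        return 0
    if anterior_len > posterior_len:
        return 1
    # Character states are mutually exclusive, and this binary character
    # has no state for equal lengths.
    raise ValueError("equal lengths are not covered by this binary character")


# Hypothetical measurements in mm: (anterior process, posterior process).
measurements = {
    "Taxon A": (12.0, 20.0),
    "Taxon B": (31.5, 18.0),
}

matrix = {taxon: score_maxilla(*lengths) for taxon, lengths in measurements.items()}
print(matrix)  # {'Taxon A': 0, 'Taxon B': 1}
```

Each taxon gets exactly one state per character, which is what makes the states mutually exclusive.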

Now, if you want to reconstruct the phylogeny of any extant group with extinct members using just living members of the group and using just morphology, then you would directly neglect the unique character combinations that fossil species exhibit. This is important because, as a general rule (there are exceptions), the more characters included in a cladistic analysis, the greater the resolution achieved. Fossils can also provide transitional morphologies between species and additional information in areas of low resolution, and therefore resolved relationships are more evolutionarily stringent. Missing out the morphological information contained within fossils constitutes a severe case of neglect, and also disregards one of the most important aspects of any evolutionary analysis: time.

Character Polarity

As shown above, characters are broken down into various character states representing variations of a particular aspect of morphology. One of the main goals of cladistic analysis is to resolve the sequence of evolutionary transformation of these particular character states. If we increase the complexity slightly to include three variables, the character becomes known as ‘multi-state’. Keeping in line with the example shown above, one possible character is:

Maxilla, anterior process, length: shorter (0), identical (1) or longer (2) than the posterior process

Note that this is a purely hypothetical example to illustrate the point. To ‘transform’ from one of these character states to an adjacent one (i.e., 0<->1 or 1<->2) costs one ‘step’, whereas transforming from 0<->2 costs two steps, because the change must pass through a transitional stage, character state 1. This is known as character ordering, and represents the directionless sequence of evolutionary transformation. However, what we want to know is the direction of character state transformation, to tell if a particular character state is the derived (apomorphic) or primitive (plesiomorphic) condition. This is achieved by polarising characters, and is where fossils play their part. As fossils are explicitly related in terms of chronostratigraphic age, this can automatically impose an evolutionary trajectory on character state polarity (i.e., the older fossils have the plesiomorphic state). This can also be achieved by ‘rooting’ a cladogram through outgroup assignment, which is an a priori determination of the plesiomorphic conditions through fossils; this is actually explained quite nicely here. The main point is that fossils perform a critical role in inferring sequences of phenotypic evolution.
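The difference between ordered and unordered characters, and the way an outgroup polarises them, can be sketched in a few lines of Python. The function names are mine, and the assumption that the outgroup shows state 0 is purely for illustration.

```python
def step_cost(a: int, b: int, ordered: bool = True) -> int:
    """Steps needed to change state a into state b for one character."""
    if ordered:
        # Ordered: every intermediate state must be passed through.
        return abs(a - b)
    # Unordered: any change between different states costs one step.
    return 0 if a == b else 1


def polarity(state: int, outgroup_state: int) -> str:
    """Label a state relative to the state seen in the outgroup."""
    return "plesiomorphic" if state == outgroup_state else "apomorphic"


for a, b in [(0, 1), (1, 2), (0, 2)]:
    print(f"{a}<->{b}: ordered={step_cost(a, b)}, "
          f"unordered={step_cost(a, b, ordered=False)}")

# Assuming (hypothetically) that the outgroup shows state 0:
print(polarity(0, outgroup_state=0))  # plesiomorphic
print(polarity(2, outgroup_state=0))  # apomorphic
```

Note that the step cost is symmetrical (0 to 2 costs the same as 2 to 0), which is why ordering alone is directionless; direction only enters once the outgroup state is fixed.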

Sampling Diversity

Now, one thing I’m sure palaeontologists are tired of hearing over and over is that the fossil record is biased in numerous ways (i.e., regarding sampling biases). Numerous studies have recently been undertaken to overcome these apparent biases, the most recent and critical of which is Hannisdal and Peters (2011). This paper explains how many of the patterns of fossil diversity we observe during the Phanerozoic can be explained by covariation between ancient biotas, sedimentation rates, and Earth system dynamics (e.g., ocean redox). Thus fossils, and the way in which we interpret them, are proving to be influential in how we interpret the co-evolution of, for example, biochemical and tectonic patterns, and contiguous biota assemblages.

The fact remains that, yes, the fossil record is biased. But now we can compensate for this bias and use it to nurture our understanding of geological processes in deep time. On the other hand, we have molecular systematists who consistently use the excuse of the ‘incomplete and biased’ nature of the fossil record to completely disregard the use of fossils, and assume that DNA-based analyses are adequate. This is actually pretty ironic, considering extant organisms (i.e., those we can extract DNA from) represent a single time slice containing a fraction of the total species that have existed on the Earth since life began, and are therefore the most biased sample of all. Hypocrisy, thy name is deoxyribonucleic acid. A recent example of this is Ericson (2011), in which fossils are omitted from the study entirely (with only a brief mention), thus compromising the accuracy of all results obtained (making inferences about Mesozoic palaeobiogeographic patterns without consulting the fossil record is pretty offensive). So, analysing and incorporating fossils into diversity analyses actually decreases relative sampling bias, and increases the empirical and theoretical validity of studies. Ignoring the fossil record for a biogeographical, phylogenetic, or any other evolutionary study is counter-productive, and pretty much blasphemy.

Breaking of Long Branches

Long branch attraction is a fairly common side-effect of genetic-based phylogenetic analysis, typically occurring when invoking parsimony. It arises as the result of highly rapid divergence between multiple lineages and, because nucleotide sites have only four possible character states, can lead to misinterpretation of homoplastic sites (e.g., through reversals, parallelisms, or convergences of states) as homologous (orthologous) sites. This can lead to erroneous inferences about the evolutionary (i.e., topological) distances between lineages. Although using advanced modelling methods such as Maximum Likelihood or Bayesian analysis can partially resolve this issue with genetic data, fossils can also be used to ‘break up’ long branches by calibration against a particular lineage in deeper time (specifically in morphological analysis), or by providing information in areas of limited information, ultimately improving phylogenetic accuracy. This is another example of the limitations of molecular-based analyses, with analogous issues in morphological analysis being quite well understood and resolved (see Cobbett et al., 2007 for a nice discussion about including fossils in cladistic analysis).
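To see why having only four character states matters, here is a hedged Python sketch using the Jukes-Cantor (JC69) substitution model, which is my choice of illustration and not something used in the studies cited here. Under that model, the expected proportion of sites that differ between two sequences levels off towards 0.75 as the true amount of change grows, so long branches hide a great many repeated substitutions.

```python
import math


def expected_p_distance(d: float) -> float:
    """Expected proportion of differing sites under JC69.

    d is the true distance in expected substitutions per site.
    """
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


# The observed difference lags further behind the true distance as d grows.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"true distance {d:>4}: observed difference {expected_p_distance(d):.3f}")
```

The gap between the true distance and the observed difference is made up of multiple hits at the same site, which is exactly where chance similarities between unrelated long branches come from.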

Calibrating Molecular Phylogenies

Molecular phylogenies are becoming increasingly used to estimate divergence times of major clades and as the basis for assessing temporal dynamics in within- and between-group diversification. However, using site-substitution rates alone to estimate the temporal origin of a clade (i.e., a node) is a poor estimation, regardless of the complexity of the models employed. Therefore, fossils with a strongly supported or well-defined taxonomic status can be used to calibrate the minimum origins of a particular clade, in a strict spatio-temporal context. This bypasses several assumptions made by models, such as stochastic or constant rates of site substitution, and is therefore an invaluable tool for accurately reconstructing phylogenies. Accordingly, the integration of ‘metadata’ (such as stratigraphy, or relative or absolute ages) is essential in reconstructing accurate phylogenetic relationships. It can also reveal additional crucial factors, such as the rates of phenotypic evolution, and how particular functional characters or morphological domains covary through geological time.
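The logic of a fossil minimum-age calibration is simple enough to sketch in Python: a clade must be at least as old as the oldest fossil confidently assigned to it. Everything below (the specimen names, clade labels and ages in millions of years) is hypothetical, and real calibrations also have to deal with dating uncertainty, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Fossil:
    name: str
    clade: str
    age_ma: float  # age in millions of years


def minimum_clade_age(fossils: Iterable[Fossil], clade: str) -> float:
    """Oldest fossil assigned to the clade sets the minimum age of its origin."""
    ages = [f.age_ma for f in fossils if f.clade == clade]
    if not ages:
        raise ValueError(f"no fossils assigned to {clade}")
    return max(ages)


def is_consistent(molecular_estimate_ma: float, fossil_minimum_ma: float) -> bool:
    """A divergence estimate younger than the fossil minimum cannot be right."""
    return molecular_estimate_ma >= fossil_minimum_ma


# Hypothetical specimens, for illustration only.
fossils = [
    Fossil("Specimen 1", "Clade X", 150.0),
    Fossil("Specimen 2", "Clade X", 165.0),
    Fossil("Specimen 3", "Clade Y", 90.0),
]

minimum = minimum_clade_age(fossils, "Clade X")
print(minimum)                        # 165.0
print(is_consistent(140.0, minimum))  # False: estimate is younger than the fossils
print(is_consistent(180.0, minimum))  # True
```

The fossil only ever gives a minimum; the clade could be considerably older than its oldest known member.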

The above examples are just some of the more significant reasons why studying fossils is crucial, and how, upon further critical analysis, fossils can yield unparalleled detail about the evolutionary history of life on Earth. It is worth noting that, although there are drawbacks and advantages to studying either the fossil record or the genetic evolution of extant taxa, it is when both are integrated that a more complete picture of global evolution emerges. Fossils are prominent in this reconstruction based on the unequivocal increased accuracy gained, but possibly at the cost of decreased resolution, due to the incomplete, patchy and biased nature of the fossil record.

To finish, I’ll quickly mention the concept of uniformitarianism: “the present is the key to the past”. This may be true for many natural processes, but the past is also key to unlocking how the present came to be and, furthermore, to predicting future patterns of biotic diversification. To neglect the fossil record is to discard the one solid piece of evidence that we have in understanding global biotic responses to very real scenarios such as global warming.

 

Further reading

Butler, R. J., Benson, R. B. J., Carrano, M. T., Mannion, P. D. and Upchurch, P. (2011) Sea level, dinosaur diversity and sampling biases: investigating the ‘common cause’ hypothesis in the terrestrial realm, Proceedings of the Royal Society, Biological Sciences, 278, 1165-1170

Cobbett, A., Wilkinson, M. and Wills, M. A. (2007) Fossils impact as hard as living taxa in parsimony analyses of morphology, Systematic Biology, 56(5), 753-766

Ericson, P. G. P. (2011) Evolution of terrestrial birds in three continents: biogeography and parallel radiations, Journal of Biogeography, doi:10.1111/j.1365-2699.2011.02650.x

Hannisdal, B. and Peters, S. E. (2011) Phanerozoic Earth system evolution and marine biodiversity, Science, 334, 1121-1124

Sereno, P. C. (2007) Logical basis for morphological characters in phylogenetic analysis, Cladistics, 23(6), 565-587