AP Syllabus focus:
‘New data can support, refine, or refute phylogenetic hypotheses about evolutionary relationships.’
Phylogenies are not fixed “answers” but testable explanations of relatedness. As new genes, fossils, and analytical methods appear, scientists evaluate whether a proposed tree remains the best-supported hypothesis and revise it when warranted.
What it means to test a phylogeny
A phylogeny is evaluated like any scientific model: by asking whether its predictions match independent evidence and whether alternative explanations fit the data better.
Phylogenetic hypothesis: A proposed, evidence-based explanation of evolutionary relationships among lineages, typically represented as a branching tree that can be tested and revised with new data.
Testing focuses on whether the hypothesised branching order and groupings are consistently supported across datasets and analyses, and whether uncertainty has been quantified appropriately.
Sources of new data that can change support
New evidence can increase confidence in a tree or reveal that a different topology is more likely.
Molecular datasets
Adding loci (more genes) reduces overreliance on any single gene history.
Using whole-genome or large multi-gene alignments increases statistical power.
Incorporating rare genomic changes (e.g., specific insertions/deletions) can provide strong, low-homoplasy signals in some groups.
Morphological and fossil information
Newly discovered or reinterpreted fossils can clarify the sequence of trait evolution and reveal intermediate character combinations.
Improved character coding (more careful definitions of traits and states) can alter inferred relationships, especially for extinct taxa where DNA is unavailable.
Broader taxon sampling
Including more species can break up long branches, reduce artefacts, and reveal previously unsampled diversity.
Adding appropriate outgroups can change inferred character polarity and rooting, which can reshape interpretations of relationships.
How hypotheses are tested (methods and comparison)
Testing commonly involves analysing the same dataset using multiple reasonable methods and checking whether key clades remain supported.
Comparing alternative trees
Scientists may explicitly compare:
A focal tree versus competing topologies (different branching orders)
Trees inferred under different assumptions (e.g., different evolutionary models)
Evidence for revision is strongest when:
Multiple independent datasets favour the same alternative topology, and
The alternative provides a clearly better fit according to accepted criteria for that method.
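One way to make "better fit" concrete is to score competing topologies on the same character matrix. The sketch below uses maximum parsimony (the Fitch algorithm) as the criterion; this is only one of several accepted criteria, and likelihood or Bayesian methods would compare trees differently. The four-taxon alignment, the taxon names, and the function names are hypothetical and chosen purely for illustration.

```python
from typing import Dict

def fitch_score(tree, states: Dict[str, str]):
    """Return (possible_states, changes) for one character on a rooted binary tree.

    A tree is either a taxon name (leaf) or a 2-tuple of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: one change is required at this node.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_length(tree, alignment: Dict[str, str]) -> int:
    """Sum the minimum number of changes over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_score(tree, column)[1]
    return total

# Hypothetical toy alignment (O is the outgroup).
alignment = {"A": "AAGTC", "B": "AAGTT", "C": "ACGAC", "O": "GCTAC"}
focal_tree = ((("A", "B"), "C"), "O")        # A groups with B
alternative_tree = ((("A", "C"), "B"), "O")  # A groups with C

focal_length = parsimony_length(focal_tree, alignment)
alternative_length = parsimony_length(alternative_tree, alignment)
```

For this toy matrix the focal tree needs fewer changes than the alternative, so parsimony favours it. With so few characters the difference means little on its own, which is why agreement across independent datasets matters.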
Assessing uncertainty and support
Because any dataset is finite, robust testing includes measures of confidence.
Common approaches include:
Bootstrap resampling: repeatedly re-analysing resampled versions of the alignment to estimate how consistently a clade appears.

Example of a phylogeny annotated with bootstrap support values on internal edges. The numeric labels represent how often each clade is recovered across many resampled replicate datasets, providing an operational measure of confidence in specific groupings.
Posterior probabilities (in Bayesian analyses): estimating the probability of clades given the data and model.
Reporting polytomies (unresolved branching) when data do not strongly discriminate among alternatives, rather than forcing resolution.
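The bootstrap idea described above can be sketched in a few lines. This is a simplified illustration, not a real phylogenetic pipeline: the "tree-building" step is replaced by a stand-in that simply picks the pair of taxa with the smallest proportion of differing sites, and the three-taxon alignment and all function names are hypothetical. Real analyses would re-infer a full tree for every replicate.

```python
import random

def p_distance(a: str, b: str) -> float:
    """Proportion of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def closest_pair(alignment):
    """Stand-in for tree inference: return the least-divergent pair of taxa.

    Ties are broken alphabetically, which is an arbitrary simplification.
    """
    taxa = sorted(alignment)
    pairs = [
        (p_distance(alignment[s], alignment[t]), s, t)
        for i, s in enumerate(taxa)
        for t in taxa[i + 1:]
    ]
    _, s, t = min(pairs)
    return frozenset((s, t))

def bootstrap_support(alignment, clade, replicates=1000, seed=0):
    """Fraction of resampled alignments in which `clade` is the closest pair."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(replicates):
        # Sample columns with replacement, keeping the alignment length fixed.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {
            taxon: "".join(seq[c] for c in cols)
            for taxon, seq in alignment.items()
        }
        if closest_pair(resampled) == frozenset(clade):
            hits += 1
    return hits / replicates

# Hypothetical toy alignment.
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTTC", "C": "ACTTACCTTC"}
support_ab = bootstrap_support(toy, {"A", "B"}, replicates=500, seed=1)
```

The returned fraction plays the role of the bootstrap value on a tree figure: it reports how stable a grouping is when the characters are resampled, not the probability that the grouping is true.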
Why different datasets can disagree
Conflicts do not automatically falsify a phylogeny; they identify biological or methodological issues that must be addressed.
Biological causes of discordance
Incomplete lineage sorting: ancestral variation persists through rapid speciation, causing gene histories to differ from the species history.

Diagram of incomplete lineage sorting (ILS), where ancestral alleles persist through closely spaced speciation events. Because alleles coalesce deeper than the species split, a single gene tree can group taxa differently than the true species tree, producing apparent conflict among datasets.
Hybridisation/introgression: genetic exchange after divergence can make some genes suggest closer relationships than the species tree implies.

Examples of phylogenetic network visualizations that incorporate introgression as reticulation edges rather than strictly bifurcating branches. Networks like these are used when gene flow between lineages violates a simple tree model, helping explain why different loci may support different topologies.
Horizontal gene transfer (especially in prokaryotes): genes move between lineages, obscuring a single branching history.
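The incomplete lineage sorting item above can be made quantitative for the simplest case of three species. Under the standard neutral coalescent model, if the internal branch of the species tree lasts T coalescent units (generations divided by 2N for a diploid population of effective size N), a gene tree matches the species tree with probability 1 - (2/3)e^(-T), and each of the two discordant topologies has probability (1/3)e^(-T). The sketch below evaluates this; the function name and the example numbers are illustrative assumptions.

```python
import math

def gene_tree_probabilities(generations: float, effective_pop_size: float):
    """Three-species coalescent: probability of each rooted gene-tree topology.

    `generations` is the length of the internal species-tree branch, i.e. the
    time between the two speciation events. Assumes a diploid, neutral,
    constant-size population with no gene flow.
    """
    t_coalescent = generations / (2 * effective_pop_size)
    discordant_each = math.exp(-t_coalescent) / 3
    return {
        "concordant": 1 - 2 * discordant_each,
        "discordant_1": discordant_each,
        "discordant_2": discordant_each,
    }

# Illustrative values only: a short versus a long interval between splits.
rapid = gene_tree_probabilities(generations=10_000, effective_pop_size=50_000)
slow = gene_tree_probabilities(generations=1_000_000, effective_pop_size=50_000)
```

With a short internal branch a large share of genes is expected to disagree with the species tree even though nothing is wrong with the data; with a long branch nearly all genes agree. Under this model the two discordant topologies are equally frequent, so a strong excess of one of them points toward another cause, such as gene flow.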
Methodological causes of discordance
Homoplasy: similar traits or sequences evolve independently (convergence/parallelism), misleading analyses if treated as shared ancestry.
Long-branch attraction: rapidly evolving lineages may cluster together erroneously in some analyses, especially with sparse sampling.
Poor alignment quality or inclusion of non-homologous regions can inject noise that mimics signal.
Revising phylogenetic hypotheses responsibly
Revisions are most defensible when they are transparent, reproducible, and conservative about what the data can actually resolve.
Best practices in revision
Reanalyse with updated models and clearly stated assumptions.
Increase taxon and character sampling rather than relying on a single “best” dataset.
Use multiple independent lines of evidence (e.g., several unlinked genes; morphology where appropriate).
Distinguish between strongly supported changes and areas that remain uncertain (retain ambiguity where needed).
What “revision” can look like
Refinement: the same major clades remain, but branching order within a clade changes with stronger support.
Support shift: a previously accepted clade loses support and is replaced by a different grouping.
Refutation: strong, repeated evidence shows a key relationship is incorrect, requiring a new hypothesis for that part of the tree.
These revisions directly reflect the syllabus focus that new data can support, refine, or refute phylogenetic hypotheses about evolutionary relationships.
FAQ
Bootstrap support reflects how often a clade appears across resampled datasets, so it is a stability measure under resampling.
Posterior probability estimates the probability of a clade given the data and the chosen model, and can be sensitive to model priors and fit.
Long-branch attraction is an artefact where fast-evolving lineages cluster together erroneously.
Ways to reduce it include:
adding taxa to break long branches
using better-fitting models of sequence evolution
removing saturated sites or poorly aligned regions
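Saturation can be illustrated with the Jukes-Cantor (JC69) distance correction, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below is a minimal illustration under that one simple model; the function name is hypothetical, and real analyses usually use richer models.

```python
import math

def jc69_distance(p_observed: float) -> float:
    """Estimated substitutions per site under JC69 from an observed p-distance.

    The correction is undefined at p >= 0.75, the level of difference
    expected between unrelated random sequences under this model.
    """
    if not 0 <= p_observed < 0.75:
        raise ValueError("JC69 correction requires 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p_observed / 3)

# The gap between observed and corrected distance widens on long branches.
for p in (0.05, 0.30, 0.60):
    print(f"observed {p:.2f} -> corrected {jc69_distance(p):.3f}")
```

As observed differences approach 0.75, repeated substitutions at the same sites hide most of the change that actually occurred. Uncorrected or poorly modelled long branches therefore look more similar to each other than they should, which is one route to long-branch attraction.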
To detect hybridisation or introgression, researchers may look for:
gene trees where only certain loci place taxa together
asymmetrical allele sharing patterns across the genome
discordance concentrated in genomic regions expected to move via hybridisation
These patterns suggest gene flow rather than a true species branching order.
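Asymmetrical allele sharing is often summarised with the ABBA-BABA test (Patterson's D). For taxa related as ((P1, P2), P3) plus an outgroup, incomplete lineage sorting alone is expected to produce the two discordant site patterns equally often, so D = (ABBA - BABA) / (ABBA + BABA) should be near zero. The sketch below counts the patterns from one sequence per taxon; the sequences and the function name are hypothetical, and a real analysis would use genome-scale data and a significance test (commonly a block jackknife).

```python
def d_statistic(p1: str, p2: str, p3: str, outgroup: str):
    """Count ABBA and BABA sites and return (abba, baba, D).

    The outgroup allele is treated as ancestral ("A"); the other allele at a
    biallelic site is derived ("B"). D is None if no informative sites exist.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # keep biallelic sites only
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    if abba + baba == 0:
        return abba, baba, None
    return abba, baba, (abba - baba) / (abba + baba)

# Hypothetical toy sequences.
abba_count, baba_count, d_value = d_statistic(
    p1="AAAAAG", p2="AGGGAA", p3="AGGGAG", outgroup="AAAAAA"
)
```

A clearly positive D suggests excess sharing between P2 and P3, and a clearly negative D suggests excess sharing between P1 and P3. A handful of sites, as in this toy input, cannot support any conclusion.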
If divergences occur in quick succession, there may be little time for diagnostic mutations to accumulate, and ancestral variants can persist across splits.
This produces many plausible gene histories, so the safest revision may be partial resolution (polytomies) rather than a fully bifurcating tree.
Different models make different assumptions about how substitution rates vary across sites and lineages. Poor model fit can bias likelihood-based inference and inflate support for incorrect groupings.
Model testing/selection (and sensitivity analyses across models) helps determine whether a proposed revision is robust or model-dependent.
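One common form of model selection compares the Akaike information criterion, AIC = 2k - 2 ln L, across candidate substitution models fitted to the same alignment and tree. The sketch below ranks three well-known models; the log-likelihood values are invented for illustration, the function names are hypothetical, and k counts only each model's free substitution parameters (branch lengths are shared and cancel in the comparison).

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

def rank_models(fits):
    """Return (model, delta_AIC) pairs sorted from best to worst."""
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((name, score - best) for name, score in scores.items()),
        key=lambda item: item[1],
    )

# Made-up log-likelihoods; parameter counts follow the usual definitions.
hypothetical_fits = {
    "JC69": (-2450.0, 0),   # equal rates, equal base frequencies
    "HKY85": (-2410.0, 4),  # transition/transversion ratio + 3 frequencies
    "GTR": (-2408.5, 8),    # 5 relative rates + 3 frequencies
}
ranking = rank_models(hypothetical_fits)
```

With these invented numbers the intermediate model ranks first: the most complex model fits slightly better but not by enough to justify its extra parameters. If a proposed revision of the tree appears only under a poorly ranked model, that is a sign it may be model-dependent.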
Practice Questions
Explain why phylogenetic trees are described as hypotheses rather than facts. (2 marks)
States that a phylogenetic tree is a proposed explanation/model of relationships based on evidence (1)
States that it is testable and can be revised/refuted when new data or analyses provide different support (1)
A published phylogeny based on one mitochondrial gene groups species A with B. A new study adds multiple nuclear genes and finds A groups with C. Describe how scientists would test whether the original hypothesis should be revised. (6 marks)
Recognises that adding independent loci (nuclear genes) provides additional/independent evidence (1)
Mentions comparing alternative topologies using appropriate phylogenetic analyses (1)
Describes assessing support/uncertainty (e.g., bootstrapping or posterior probabilities) for the competing clades (1)
Notes checking for biological causes of discordance (e.g., introgression, incomplete lineage sorting) (1)
Notes evaluating methodological issues (e.g., model choice, alignment quality, long-branch attraction, taxon sampling) (1)
Concludes that consistent stronger support across datasets/analyses would justify refining or refuting the original hypothesis (1)
