AP Syllabus focus:
‘Traits gained or lost during evolution are used to construct phylogenetic trees and cladograms.’
Phylogenies are built by tracing how heritable traits change across lineages. By identifying where traits were likely gained or lost, biologists infer relatedness and propose the most supported branching pattern among taxa.
Core idea: inferring ancestry from trait changes
Phylogenetic construction treats evolution as a history of character-state changes along branches. Closely related organisms tend to share more derived (recently evolved) traits than distantly related organisms do.
Characters and character states
A character is a heritable feature used for comparison (morphology, development, molecules). Each character can occur in different states (e.g., “tail present” vs “tail absent”).
Character state: A particular form of a heritable trait used in phylogenetic analysis (for example, a specific nucleotide at a site, or presence/absence of a structure).
To build a tree, scientists compile a character list across taxa, then ask: what arrangement of branches requires the fewest and most plausible gains and losses of these states?

Figure: A taxon–character matrix coding multiple vertebrates for binary character states (0/1) such as “four legs” and “amniote egg.” Matrices like this are the raw input for many phylogenetic methods, which then infer where character states changed on the tree. The pattern of shared derived states across taxa is what ultimately supports proposed clades.
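A matrix of this kind can be sketched in a few lines of Python. The taxa, the characters, and the helper name `taxa_with_state` are illustrative assumptions chosen for this example, not data taken from the figure.

```python
# Hypothetical taxon-character matrix: 1 = derived state present, 0 = absent.
CHARACTERS = ["four legs", "amniote egg", "hair"]

MATRIX = {
    "lamprey": [0, 0, 0],
    "frog":    [1, 0, 0],
    "lizard":  [1, 1, 0],
    "mouse":   [1, 1, 1],
    "human":   [1, 1, 1],
}

def taxa_with_state(matrix, characters, name, state=1):
    """Return the sorted taxa that show `state` for the named character."""
    col = characters.index(name)
    return sorted(taxon for taxon, row in matrix.items() if row[col] == state)

# List which taxa carry the derived state of each character.
for name in CHARACTERS:
    print(name, "->", taxa_with_state(MATRIX, CHARACTERS, name))
```

Each column is one character and each row is one taxon, so reading down a column shows which taxa share a state.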
Gained traits: shared derived characters
A major signal for grouping is a trait that evolved in a common ancestor and is inherited by all its descendants.
Synapomorphies (shared derived traits)
A synapomorphy supports a clade because it indicates a shared evolutionary origin.
Synapomorphy: A shared, derived character state present in multiple taxa because it evolved in their most recent common ancestor.
Key implications for constructing phylogenies:
If several taxa share a synapomorphy, they are hypothesised to form a clade (monophyletic group).
The inferred “gain” of that trait is placed on the branch leading to their common ancestor.

Figure: A vertebrate phylogeny with derived character states mapped onto branches as hash marks (e.g., four legs, amniote egg, hair, mammary glands, live birth). Each labeled mark represents an inferred evolutionary change used to support a node (clade) and to visualize where traits were gained along lineages. This is the basic logic behind character mapping/optimisation on trees.
The more independent characters that support the same grouping, the stronger the phylogenetic hypothesis.
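The idea that several independent characters can support the same grouping can be illustrated with a small sketch. It assumes the ancestral (0) and derived (1) states are already known for every character, and the data and the function name `candidate_clades` are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: 1 = derived state, 0 = ancestral state (polarity assumed known).
MATRIX = {
    "lamprey": {"four legs": 0, "amniote egg": 0, "hair": 0, "mammary glands": 0},
    "frog":    {"four legs": 1, "amniote egg": 0, "hair": 0, "mammary glands": 0},
    "lizard":  {"four legs": 1, "amniote egg": 1, "hair": 0, "mammary glands": 0},
    "mouse":   {"four legs": 1, "amniote egg": 1, "hair": 1, "mammary glands": 1},
    "human":   {"four legs": 1, "amniote egg": 1, "hair": 1, "mammary glands": 1},
}

def candidate_clades(matrix):
    """Map each group of taxa sharing a derived state to its supporting characters."""
    support = defaultdict(list)
    characters = next(iter(matrix.values())).keys()
    for char in characters:
        group = frozenset(t for t, states in matrix.items() if states[char] == 1)
        if len(group) >= 2:  # a state found in only one taxon cannot group taxa
            support[group].append(char)
    return dict(support)

for group, chars in candidate_clades(MATRIX).items():
    print(sorted(group), "supported by", chars)
```

Here the mouse and human grouping is supported by two characters, while the larger groups are each supported by one. This simple tally ignores homoplasy, so it only proposes groupings and does not test them.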
Lost traits: reversals and secondary loss
Trait absence can be informative, but losing a trait is often evolutionarily easier than gaining a complex one, so the same loss can occur multiple times independently.
How loss is represented on trees
When a trait is inferred to have been present in an ancestor but absent in a descendant lineage, it is mapped as a loss (sometimes called a reversal if it returns to an ancestral-like state). Loss matters because it can:
Break an expected pattern (a descendant lineage lacks a trait that close relatives retain).
Create misleading similarity (multiple lineages independently lose the same trait and appear similar by absence).
Common biological routes to loss:
Regulatory changes reducing gene expression (trait not produced)
Deletions or disabling mutations in coding sequences
Developmental changes that eliminate a structure (secondary simplification)
Mapping gains and losses: character optimisation
Once a candidate branching order is proposed, each character is “optimised” by placing the minimum (or most likely) number of changes on branches.
The logic of parsimony
A widely taught approach is maximum parsimony, which prefers the tree requiring the fewest evolutionary steps (fewest gains/losses across all characters). In practice:
A single shared gain can be strong evidence for relatedness.
Multiple independent gains or multiple independent losses are considered less parsimonious (but still possible).
Parsimony is most useful when:
Many independent characters are sampled
Characters are clearly homologous (same trait, not superficially similar)
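Step counting under parsimony can be sketched with the Fitch algorithm, one standard way to find the minimum number of changes for an unordered character on a fixed rooted binary tree. The notes do not name a specific algorithm, so this choice, the nested-tuple tree format, and the example data are assumptions made for illustration.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, minimum steps) for one character.

    `tree` is a taxon name (leaf) or a 2-tuple of subtrees;
    `states` maps each taxon name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    shared = left_set & right_set
    if shared:  # the two subtrees can agree on a state: no extra change
        return shared, left_steps + right_steps
    # No state in common: at least one change is needed at this node.
    return left_set | right_set, left_steps + right_steps + 1

def tree_length(tree, matrix):
    """Sum the minimum steps over every character (column) in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {t: row[i] for t, row in matrix.items()})[1]
        for i in range(n_chars)
    )

# Hypothetical characters: four legs, amniote egg, hair (1 = present).
MATRIX = {
    "lamprey": [0, 0, 0],
    "frog":    [1, 0, 0],
    "lizard":  [1, 1, 0],
    "mouse":   [1, 1, 1],
    "human":   [1, 1, 1],
}

tree_a = ("lamprey", ("frog", ("lizard", ("mouse", "human"))))
tree_b = ("lamprey", (("frog", "mouse"), ("lizard", "human")))

print(tree_length(tree_a, MATRIX))  # 3 steps: one gain per character
print(tree_length(tree_b, MATRIX))  # 5 steps: extra gains or losses are needed
```

With these data, parsimony would prefer `tree_a` because it explains the same matrix with fewer changes. Real analyses compare many more candidate trees and characters.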
Pitfalls: homoplasy and trait ambiguity
Not all shared traits reflect shared ancestry. Some similarities arise through homoplasy, which complicates deciding whether a trait was gained once or multiple times, or gained then lost.
Homoplasy (independent evolution)
Homoplasy includes:
Convergent evolution: similar trait evolves independently in different lineages
Parallel evolution: similar developmental/genetic routes produce similar traits independently
Reversals: derived state changes back to an ancestral-like state

Figure: A teaching cladogram that pairs a presence/absence character list with a tree showing where characters are inferred to originate on branches. Because it also highlights a convergent trait (a homoplasy), it makes clear why shared similarity is not always evidence of shared ancestry. This kind of diagram helps distinguish true synapomorphies from repeated, independent evolution.
How homoplasy affects gains/losses:
A shared trait might represent multiple independent gains rather than one gain.
An absent trait might represent multiple independent losses rather than shared ancestry.
Some characters (especially those under strong selection) may be more prone to repeated evolution, increasing homoplasy risk.
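One way to see homoplasy in practice is to check whether a character needs more changes on a given tree than its number of states strictly requires. The sketch below uses a Fitch-style step count on a hypothetical four-taxon tree. The tree, the character codings, and the function names are assumptions for illustration only.

```python
def min_steps(tree, states):
    """Fitch-style (state set, steps) for one character on a rooted binary tree."""
    if isinstance(tree, str):  # leaf: a taxon name
        return {states[tree]}, 0
    left_set, left_steps = min_steps(tree[0], states)
    right_set, right_steps = min_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1

def implies_homoplasy(tree, states):
    """True if the tree needs more changes than (number of states - 1)."""
    n_states = len(set(states.values()))
    _, steps = min_steps(tree, states)
    return steps > n_states - 1

# Assumed tree: bats group with mice, birds group with crocodiles.
TREE = (("bat", "mouse"), ("bird", "crocodile"))

CHARACTERS = {
    "hair":  {"bat": 1, "mouse": 1, "bird": 0, "crocodile": 0},
    "wings": {"bat": 1, "mouse": 0, "bird": 1, "crocodile": 0},
}

for name, states in CHARACTERS.items():
    print(name, "homoplasy implied:", implies_homoplasy(TREE, states))
```

On this tree, hair fits a single change, whereas wings require two changes, consistent with independent gains in bats and birds.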
Choosing and coding traits to reflect real gains and losses
Trait choice and coding strongly influence where gains/losses are inferred.
Guidelines for high-utility characters
Use heritable traits (genetic/molecular or reliably inherited morphology), not environmentally induced differences.
Prefer traits likely to be homologous (same underlying structure or sequence position).
Sample many characters to reduce the impact of any single misleading gain/loss pattern.
Code multi-state traits carefully (e.g., “0, 1, 2”) so that inferred transitions represent biologically reasonable changes.
Molecular traits as gain/loss information
DNA and protein characters can reflect:
Substitutions (state changes at sites)
Insertions/deletions (gain/loss of sequence segments)
Gene gains/losses across genomes
These molecular changes can be mapped onto branches the same way as morphological gains and losses, using patterns of shared derived states to propose clades.
FAQ
To tell whether a trait was lost rather than never present, biologists look for evidence that ancestors had the trait, such as developmental traces, pseudogenes, or conserved regulatory elements.
Comparative anatomy and genomics can reveal remnants consistent with secondary loss.
If the states of a multi-state character are treated as unordered, any state can change directly to any other, possibly reducing inferred steps.
If ordered (e.g., 0→1→2), transitions imply sequential changes, often increasing inferred steps and shifting where gains/losses are mapped.
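The effect of ordering on step counts can be made concrete with a small cost function. This is a minimal sketch that assumes states are coded as integers and that an ordered change costs one step per intermediate state passed. The function name and the branch data are hypothetical.

```python
def change_cost(a, b, ordered):
    """Steps charged for a change from state a to state b along one branch."""
    if ordered:
        return abs(a - b)       # 0 -> 2 must pass through 1, so it costs 2
    return 0 if a == b else 1   # unordered: any change counts as one step

# Hypothetical changes along three branches: (state at start, state at end).
branch_changes = [(0, 2), (1, 2), (2, 2)]

unordered_total = sum(change_cost(a, b, ordered=False) for a, b in branch_changes)
ordered_total = sum(change_cost(a, b, ordered=True) for a, b in branch_changes)

print("unordered:", unordered_total)  # 2
print("ordered:", ordered_total)      # 3
```

The same set of changes costs more under the ordered coding, which is why the choice of coding can shift which tree looks most parsimonious.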
Traits vary in their likelihood of repeated evolution.
Researchers may down-weight characters prone to homoplasy (e.g., traits under strong similar selection) and up-weight more conservative characters, but weighting can introduce subjectivity.
An insertion can be treated as a “gain” of a sequence segment; a deletion as a “loss.”
Shared indels at the same genomic location can be strong evidence of common ancestry because identical indels are less likely to occur independently.
Character conflict occurs when different traits support different branching patterns because their inferred gains/losses disagree.
Analysts may:
Increase character sampling
Re-check homology assumptions and alignment
Use alternative models/methods and compare support for competing mappings
Practice Questions
In a phylogeny, taxa B, C, and D share a derived trait state not found in A or E. Explain what this suggests about B, C, and D, and how the trait is used on the tree. (2 marks)
Identifies the trait as a shared derived character/synapomorphy supporting a clade containing B, C, and D. (1)
States the trait is inferred to have arisen (a gain) in the most recent common ancestor of B, C, and D and mapped to that branch. (1)
A proposed tree shows that trait X is present in taxa 1, 2, and 5 but absent in taxa 3 and 4. Explain two different evolutionary scenarios (in terms of gains and/or losses) that could produce this pattern, and state how parsimony would evaluate them. (5 marks)
Scenario 1 describes one gain of trait X in the common ancestor of taxa 1, 2, 3, 4, and 5, with subsequent loss in the lineage leading to 3 and 4 (a single loss if 3 and 4 form a clade, or separate losses otherwise). (1)
Scenario 2 describes independent gains of trait X in the lineage leading to 1 and 2 and separately in the lineage leading to 5 (or other valid multiple-gain explanation). (1)
Correctly links each scenario to different numbers of evolutionary steps (changes). (1)
States that maximum parsimony prefers the scenario with fewer total gains/losses (fewer steps). (1)
Notes that repeated losses or gains represent homoplasy and can mislead inference if not considered across many characters. (1)
